Preface


It has been customary in Cambridge for many years to include 
as part of the Mathematical Tripos a brief introductory course 
on the Theory of Numbers. This volume is a somewhat fuller 
version of the lecture notes attaching to the course as delivered 
by me in recent times. It has been prepared on the suggestion 
and with the encouragement of the University Press. 

The subject has a long and distinguished history, and indeed 
the concepts and problems relating to the theory have been 
instrumental in the foundation of a large part of mathematics. 
The present text describes the rudiments of the field in a simple 
and direct manner. It is very much to be hoped that it will serve 
to stimulate the reader to delve into the rich literature associated 
with the subject and thereby to discover some of the deep and 
beautiful theories that have been created as a result of numerous 
researches over the centuries. Some guides to further study are 
given at the ends of the chapters. By way of introduction, there 
is a short account of the Disquisitiones arithmeticae of Gauss, 
and, to begin with, the reader can scarcely do better than to 
consult this famous work. 

I am grateful to Mrs S. Lowe for her careful preparation of 
the typescript, to Mr P. Jackson for his meticulous subediting, 
to Dr D. J. Jackson for providing me with a computerized version 
of Fig. 8.1, and to Dr R. C. Mason for his help in checking the 
proof-sheets and for useful suggestions. 


Cambridge 1983


Introduction


Gauss and number theory * 


Without doubt the theory of numbers was Gauss’ favourite sub- 
ject. Indeed, in a much quoted dictum, he asserted that Mathe- 
matics is the Queen of the Sciences and the Theory of Numbers 
is the Queen of Mathematics. Moreover, in the introduction to 
Eisenstein’s Mathematische Abhandlungen, Gauss wrote ‘The 
Higher Arithmetic presents us with an inexhaustible storehouse 
of interesting truths - of truths, too, which are not isolated but 
stand in the closest relation to one another, and between which, 
with each successive advance of the science, we continually 
discover new and sometimes wholly unexpected points of con- 
tact. A great part of the theories of Arithmetic derive an addi- 
tional charm from the peculiarity that we easily arrive by induc- 
tion at important propositions which have the stamp of sim- 
plicity upon them but the demonstration of which lies so deep 
as not to be discovered until after many fruitless efforts; and 
even then it is obtained by some tedious and artificial process 
while the simpler methods of proof long remain hidden from us.’ 

* This article was originally prepared for a meeting of the British
Society for the History of Mathematics held in Cambridge in
1977 to celebrate the bicentenary of Gauss’ birth.

All this is well illustrated by what is perhaps Gauss’ most
profound publication, namely his Disquisitiones arithmeticae.
It has been described, quite justifiably I believe, as the Magna
Carta of Number Theory, and the depth and originality of
thought manifest in this work are particularly remarkable
considering that it was written when Gauss was only about eighteen
years of age. Of course, as Gauss said himself, not all of the
subject matter was new at the time of writing, and Gauss
acknowledged the considerable debt that he owed to earlier
scholars, in particular Fermat, Euler, Lagrange and Legendre.
But the Disquisitiones arithmeticae was the first systematic 
treatise on the Higher Arithmetic and it provided the foundations 
and stimulus for a great volume of subsequent research which 
is in fact continuing to this day. The importance of the work 
was recognized as soon as it was published in 1801 and the first 
edition quickly became unobtainable; indeed many scholars of 
the time had to resort to taking handwritten copies. But it was 
generally regarded as a rather impenetrable work and it was 
probably not widely understood; perhaps the formal Latin style
contributed in this respect. Now, however, after numerous re- 
formulations, most of the material is very well known, and the 
earlier sections at least are included in every basic course on 
number theory. 

The text begins with the definition of a congruence, namely 
two numbers are said to be congruent modulo n if their difference 
is divisible by n. This is plainly an equivalence relation in the 
now familiar terminology. Gauss proceeds to the discussion of 
linear congruences and shows that they can in fact be treated 
somewhat analogously to linear equations. He then turns his 
attention to power residues and introduces, amongst other things, 
the concepts of primitive roots and indices; and he notes, in 
particular, the resemblance between the latter and the ordinary 
logarithms. There follows an exposition of the theory of quad- 
ratic congruences, and it is here that we meet, more especially, 
the famous law of quadratic reciprocity; this asserts that if p, q 
are primes, not both congruent to 3 (mod 4), then p is a residue 
or non-residue of q according as q is a residue or non-residue 
of p, while in the remaining case the opposite occurs. As is well 
known, Gauss spent a great deal of time on this result and gave 
several demonstrations; and it has subsequently stimulated much 
excellent research. In particular, following works of Jacobi, 
Eisenstein and Kummer, Hilbert raised as the ninth of his famous 
list of problems presented at the Paris Congress of 1900 the 
question of obtaining higher reciprocity laws, and this led to 
the celebrated studies of Furtwängler, Artin and others in the
context of class field theory. 


By far the largest section of the Disquisitiones arithmeticae is 
concerned with the theory of binary quadratic forms. Here Gauss 
describes how quadratic forms with a given discriminant can 
be divided into classes so that two forms belong to the same 
class if and only if there exists an integral unimodular substitu- 
tion relating them, and how the classes can be divided into 
genera, so that two forms are in the same genus if and only if 
they are rationally equivalent. He proceeds to apply these con- 
cepts so as, for instance, to throw light on the difficult question 
of the representation of integers by binary forms. It is a remark- 
able and beautiful theory with many important ramifications. 
Indeed, after re-interpretation in terms of quadratic fields, it 
became apparent that it could be applied much more widely, 
and in fact it can be regarded as having provided the foundations 
for the whole of algebraic number theory. The term Gaussian 
field, meaning the field generated over the rationals by i, is a 
reminder of Gauss’ pioneering work in this area. 

The remainder of the Disquisitiones arithmeticae contains 
results of a more miscellaneous character, relating, for instance, 
to the construction of seventeen-sided polygons, which was 
clearly of particular appeal to Gauss, and to what is now termed 
the cyclotomic field, that is the field generated by a primitive 
root of unity. And especially noteworthy here is the discussion 
of certain sums involving roots of unity, now referred to as 
Gaussian sums, which play a fundamental role in the analytic 
theory of numbers. 

I conclude this introduction with some words of Mordell. In 
an essay published in 1917 he wrote ‘The theory of numbers is 
unrivalled for the number and variety of its results and for the 
beauty and wealth of its demonstrations. The Higher Arithmetic 
seems to include most of the romance of mathematics. As Gauss 
wrote to Sophie Germain, the enchanting beauties of this sublime 
study are revealed in their full charm only to those who have 
the courage to pursue it.’ And Mordell added ‘We are reminded
of the folk-tales, current amongst all peoples, of the Prince 
Charming who can assume his proper form as a handsome prince 
only because of the devotedness of the faithful heroine.’


Divisibility


1 Foundations 

The set 1, 2, 3, . . . of all natural numbers will be denoted 
by N. There is no need to enter here into philosophical questions 
concerning the existence of N. It will suffice to assume that it is 
a given set for which the Peano axioms are satisfied. They imply 
that addition and multiplication can be defined on N such that 
the commutative, associative and distributive laws are valid. 
Further, an ordering on N can be introduced so that either m < n 
or n<m for any distinct elements m, n in N. Furthermore, 
it is evident from the axioms that the principle of mathe- 
matical induction holds and that every non-empty subset of N 
has a least member. We shall frequently appeal to these 
properties. 

As customary, we shall denote by Z the set of integers 
0, ±1, ±2, . . . , and by Q the set of rationals, that is the numbers 
p/q with p in Z and q in N. The construction, commencing 
with N, of Z, Q and then the real and complex numbers R and 
C forms the basis of Mathematical Analysis and it is assumed 
known. 

2 Division algorithm 

Suppose that a, b are elements of N. One says that b
divides a (written b | a) if there exists an element c of N such
that a = bc. In this case b is referred to as a divisor of a, and a
is called a multiple of b. The relation b | a is reflexive and
transitive but not symmetric; in fact if b | a and a | b then a = b.
Clearly also if b | a then b ≤ a and so a natural number has only
finitely many divisors. The concept of divisibility is readily
extended to Z; if a, b are elements of Z, with b ≠ 0, then b is
said to divide a if there exists c in Z such that a = bc.

We shall frequently appeal to the division algorithm. This
asserts that for any a, b in Z, with b > 0, there exist q, r in Z
such that a = bq + r and 0 ≤ r < b. The proof is simple; indeed if
bq is the largest multiple of b that does not exceed a then the
integer r = a - bq is certainly non-negative and, since
b(q + 1) > a, we have r < b. The result plainly remains valid for
any integer b ≠ 0 provided that the bound r < b is replaced by
r < |b|.
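
The division algorithm can be sketched in Python as follows. The function name `division` is an illustrative choice, not notation from the text; the sketch relies on the built-in `divmod`, which for a positive divisor returns a remainder in the range 0 ≤ r < b.

```python
def division(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < |b|, for any integer b != 0."""
    if b == 0:
        raise ValueError("b must be non-zero")
    # For a positive divisor, divmod yields a quotient and a remainder with 0 <= r < |b|.
    q, r = divmod(a, abs(b))
    # If b is negative, flipping the sign of q keeps a == b*q + r with the same r.
    if b < 0:
        q = -q
    return q, r
```

For instance, `division(-7, 3)` gives q = -3 and r = 2, since -7 = 3·(-3) + 2.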

3 Greatest common divisor 

By the greatest common divisor of natural numbers a,
b we mean an element d of N such that d | a, d | b and every
common divisor of a and b also divides d. We proceed to prove
that a number d with these properties exists; plainly it will be
unique, for any other such number d' would divide a, b and so
also d, and since similarly d | d' we have d = d'.

Accordingly consider the set of all natural numbers of the 
form ax + by with x, y in Z. The set is not empty since, for 
instance, it contains a and b; hence there is a least member d, 
say. Now d = ax + by for some integers x, y, whence every com- 
mon divisor of a and b certainly divides d. Further, by the 
division algorithm, we have a = dq + r for some q, r in Z with
0 ≤ r < d; this gives r = ax' + by', where x' = 1 - qx and y' = -qy.
Thus, from the minimal property of d, it follows that r = 0
whence d | a. Similarly we have d | b, as required.

It is customary to signify the greatest common divisor of a, b 
by (a, b). Clearly, for any n in N, the equation ax + by = n is 
soluble in integers x, y if and only if (a, b) divides n. In the case 
(a, b) = 1 we say that a and b are relatively prime or coprime 
(or that a is prime to b). Then the equation ax + by = n is always 
soluble. 

Obviously one can extend these concepts to more than two
numbers. In fact one can show that any elements $a_1, \ldots, a_m$ of
N have a greatest common divisor $d = (a_1, \ldots, a_m)$ such that
$d = a_1 x_1 + \cdots + a_m x_m$ for some integers $x_1, \ldots, x_m$.
Further, if d = 1, we say that $a_1, \ldots, a_m$ are relatively prime
and then the equation $a_1 x_1 + \cdots + a_m x_m = n$ is always soluble.


4 Euclid’s algorithm 

A method for finding the greatest common divisor d of 
a, b was described by Euclid. It proceeds as follows. 

By the division algorithm there exist integers $q_1$, $r_1$ such that
$a = bq_1 + r_1$ and $0 \le r_1 < b$. If $r_1 \ne 0$ then there exist
integers $q_2$, $r_2$ such that $b = r_1 q_2 + r_2$ and $0 \le r_2 < r_1$.
If $r_2 \ne 0$ then there exist integers $q_3$, $r_3$ such that
$r_1 = r_2 q_3 + r_3$ and $0 \le r_3 < r_2$. Continuing thus, one obtains
a decreasing sequence $r_1, r_2, \ldots$ satisfying
$r_{j-2} = r_{j-1} q_j + r_j$. The sequence terminates when $r_{k+1} = 0$
for some k, that is when $r_{k-1} = r_k q_{k+1}$. It is then readily
verified that $d = r_k$. Indeed it is evident from the equations that
every common divisor of a and b divides $r_1, r_2, \ldots, r_k$; and
moreover, viewing the equations in the reverse order, it is clear
that $r_k$ divides each $r_j$ and so also b and a.

Euclid’s algorithm furnishes another proof of the existence of
integers x, y satisfying d = ax + by, and furthermore it enables
these x, y to be explicitly calculated. For we have $d = r_k$ and
$r_j = r_{j-2} - r_{j-1} q_j$, whence the required values can be
obtained by successive substitution. Let us take, for example,
a = 187 and b = 35. Then, following Euclid, we have

$$187 = 35 \cdot 5 + 12, \quad 35 = 12 \cdot 2 + 11, \quad 12 = 11 \cdot 1 + 1.$$

Thus we see that (187, 35) = 1 and moreover

$$1 = 12 - 11 \cdot 1 = 12 - (35 - 12 \cdot 2) = 3(187 - 35 \cdot 5) - 35.$$

Hence a solution of the equation 187x + 35y = 1 in integers x, y
is given by x = 3, y = -16.
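
The successive substitution can be mechanised. The following is a minimal sketch of the extended form of Euclid's algorithm, which carries the coefficients x, y along with the remainders instead of substituting backwards at the end; the function name `extended_gcd` and the variable names are illustrative assumptions.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and d == a*x + b*y, for natural numbers a, b."""
    # Invariant maintained throughout: r0 == a*x0 + b*y0 and r1 == a*x1 + b*y1.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1                      # quotient from the division algorithm
        r0, r1 = r1, r0 - q * r1          # next remainder
        x0, x1 = x1, x0 - q * x1          # same step applied to the x coefficients
        y0, y1 = y1, y0 - q * y1          # and to the y coefficients
    return r0, x0, y0
```

Applied to the example in the text, `extended_gcd(187, 35)` returns `(1, 3, -16)`, in agreement with 187·3 - 35·16 = 1.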

There is a close connection between Euclid's algorithm and 
the theory of continued fractions; this will be discussed in 
Chapter 6. 

5 Fundamental theorem 

A natural number, other than 1, is called a prime if it is 
divisible only by itself and 1. The smallest primes are therefore 
given by 2, 3, 5, 7, 11, ....

Let n be any natural number other than 1. The least divisor
of n that exceeds 1 is plainly a prime, say $p_1$. If $n \ne p_1$ then,
similarly, there is a prime $p_2$ dividing $n/p_1$. If $n \ne p_1 p_2$ then
there is a prime $p_3$ dividing $n/(p_1 p_2)$; and so on. After a finite
number of steps we obtain $n = p_1 \cdots p_m$; and by grouping
together we get the standard factorization (or canonical
decomposition) $n = p_1^{j_1} \cdots p_k^{j_k}$, where $p_1, \ldots, p_k$
denote distinct primes and $j_1, \ldots, j_k$ are elements of N.

The fundamental theorem of arithmetic asserts that the above 
factorization is unique except for the order of the factors. To 
prove the result, note first that if a prime p divides a product 
mn of natural numbers then either p divides m or p divides n. 
Indeed if p does not divide m then (p, m) = 1 whence there exist
integers x, y such that px + my = 1; thus we have pnx + mny = n
and hence p divides n. More generally we conclude that if p
divides $n_1 n_2 \cdots n_k$ then p divides $n_l$ for some l. Now suppose
that, apart from the factorization $n = p_1^{j_1} \cdots p_k^{j_k}$ derived
above, there is another decomposition and that p' is one of the
primes occurring therein. From the preceding conclusion we obtain
$p' = p_l$ for some l. Hence we deduce that, if the standard
factorization for n/p' is unique, then so also is that for n. The
fundamental theorem follows by induction.

It is simple to express the greatest common divisor (a, b) of
elements a, b of N in terms of the primes occurring in their
decompositions. In fact we can write
$a = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ and
$b = p_1^{\beta_1} \cdots p_k^{\beta_k}$, where $p_1, \ldots, p_k$ are
distinct primes and the $\alpha$s and $\beta$s are non-negative integers;
then $(a, b) = p_1^{\gamma_1} \cdots p_k^{\gamma_k}$, where
$\gamma_l = \min(\alpha_l, \beta_l)$. With the same notation, the lowest
common multiple of a, b is defined by
$\{a, b\} = p_1^{\delta_1} \cdots p_k^{\delta_k}$, where
$\delta_l = \max(\alpha_l, \beta_l)$. The identity (a, b){a, b} = ab is
readily verified.
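
As an illustration only, the standard factorization and the min/max description of (a, b) and {a, b} can be sketched in Python. The helper names `factorize`, `from_factors` and `gcd_lcm` are hypothetical, and trial division is assumed merely because it mirrors the argument above (the least divisor exceeding 1 is a prime); it is not efficient for large n.

```python
from collections import Counter

def factorize(n: int) -> Counter:
    """Standard factorization of a natural number n, as {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # the least divisor exceeding 1 is always a prime
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                  # whatever remains is itself a prime
        factors[n] += 1
    return factors

def from_factors(factors: Counter) -> int:
    """Multiply out a factorization {prime: exponent}."""
    result = 1
    for p, e in factors.items():
        result *= p ** e
    return result

def gcd_lcm(a: int, b: int) -> tuple[int, int]:
    """Return ((a, b), {a, b}) using minima and maxima of the exponents."""
    fa, fb = factorize(a), factorize(b)
    # For Counters, & takes the minimum of the counts and | takes the maximum.
    return from_factors(fa & fb), from_factors(fa | fb)
```

For a = 84 and b = 90 this gives (a, b) = 6 and {a, b} = 1260, and indeed 6·1260 = 84·90.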

6 Properties of the primes 

There exist infinitely many primes, for if $p_1, \ldots, p_n$ is
any finite set of primes then $p_1 \cdots p_n + 1$ is divisible by a
prime different from $p_1, \ldots, p_n$; the argument is due to Euclid.
It follows that, if $p_n$ is the nth prime in ascending order of
magnitude, then $p_m$ divides $p_1 \cdots p_n + 1$ for some
$m \ge n + 1$; from this we deduce by induction that $p_n < 2^{2^n}$.
In fact a much stronger result is known; indeed $p_n \sim n \log n$ as
$n \to \infty$.† The result is equivalent to the assertion that the
number $\pi(x)$ of primes $p \le x$ satisfies $\pi(x) \sim x/\log x$ as
$x \to \infty$. This is called the prime-number theorem and it was
proved by Hadamard and de la Vallée Poussin independently in 1896.
Their proofs were based on properties of the Riemann zeta-function
about which we shall speak in Chapter 2. In 1737 Euler proved that
the series $\sum 1/p_n$ diverges and he noted that this gives another
demonstration of the existence of infinitely many primes. In fact
it can be shown by elementary arguments that, for some number c,

$$\sum_{p \le x} 1/p = \log\log x + c + O(1/\log x).$$

† The notation $f \sim g$ means that $f/g \to 1$; and one says that f is
asymptotic to g.

Fermat conjectured that the numbers $2^{2^n} + 1$ (n = 1, 2, ...) are
all primes; this is true for n = 1, 2, 3 and 4 but false for n = 5, as
was proved by Euler. In fact 641 divides $2^{32} + 1$. Numbers of
the above form that are primes are called Fermat primes. They 
are closely connected with the existence of a construction of a 
regular plane polygon with ruler and compasses only. In fact 
the regular plane polygon with p sides, where p is a prime, is 
capable of construction if and only if p is a Fermat prime. It is 
not known at present whether the number of Fermat primes is 
finite or infinite. 
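
The statements about the first few Fermat numbers are easy to confirm numerically. The sketch below is illustrative only: the helper names are assumptions, and primality is tested by naive trial division, which suffices for numbers of this size.

```python
def fermat_number(n: int) -> int:
    """Return 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def is_prime(m: int) -> bool:
    """Naive trial division; adequate only for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

# Primality of the Fermat numbers for n = 1, 2, 3, 4 (5, 17, 257, 65537).
small_cases = [is_prime(fermat_number(n)) for n in range(1, 5)]

# Euler's factor: 641 divides 2^32 + 1 exactly.
quotient, remainder = divmod(fermat_number(5), 641)
```

Here `small_cases` consists of four `True` values and `remainder` is 0, so the number for n = 5 is composite.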

Numbers of the form $2^n - 1$ that are primes are called Mersenne
primes. In this case n is a prime, for plainly $2^m - 1$ divides
$2^n - 1$ if m divides n. Mersenne primes are of particular interest in
providing examples of large prime numbers; for instance it is
known that $2^{44497} - 1$ is the 27th Mersenne prime, a number with
13 395 digits.

It is easily seen that no polynomial f(n) with integer 
coefficients can be prime for all n in N, or even for all sufficiently 
large n, unless f is constant. Indeed by Taylor’s theorem,
f(mf(n) + n) is divisible by f(n) for all m in N. On the other
hand, the remarkable polynomial $n^2 - n + 41$ is prime for n =
1, 2, ..., 40. Furthermore one can write down a polynomial
$f(n_1, \ldots, n_k)$ with the property that, as the $n_j$ run through the
elements of N, the set of positive values assumed by f is precisely
the sequence of primes. The latter result arises from studies in 
logic relating to Hilbert’s tenth problem (see Chapter 8). 

The primes are well distributed in the sense that, for every 
n > 1, there is always a prime between n and 2n. This result,
which is commonly referred to as Bertrand’s postulate, can be
regarded as the forerunner of extensive researches on the
difference $p_{n+1} - p_n$ of consecutive primes. In fact estimates of
the form $p_{n+1} - p_n = O(p_n^{\kappa})$ are known with values of
$\kappa$ just a little greater than 1/2; but, on the other hand, the
difference is certainly not bounded, since the consecutive integers
n! + m with m = 2, 3, ..., n are all composite. A famous theorem of
Dirichlet asserts that any arithmetical progression a, a + q,
a + 2q, ..., where (a, q) = 1, contains infinitely many primes. Some special
cases, for instance the existence of infinitely many primes of the 
form 4n + 3, can be deduced simply by modifying Euclid’s 
argument given at the beginning, but the general result lies quite 
deep. Indeed Dirichlet’s proof involved, amongst other things, 
the concepts of characters and L-functions, and of class numbers 
of quadratic forms, and it has been of far-reaching significance 
in the history of mathematics. 

Two notorious unsolved problems in prime-number theory 
are the Goldbach conjecture, mentioned in a letter to Euler of 
1742, to the effect that every even integer (>2) is the sum of two 
primes, and the twin-prime conjecture, to the effect that there 
exist infinitely many pairs of primes, such as 3, 5 and 17, 19, 
that differ by 2. By ingenious work on sieve methods, Chen 
showed in 1974 that these conjectures are valid if one of the 
primes is replaced by a number with at most two prime factors 
(assuming, in the Goldbach case, that the even integer is 
sufficiently large). The oldest known sieve, incidentally, is due 
to Eratosthenes. He observed that if one deletes from the set of 
integers 2, 3, . . . , n, first all multiples of 2, then all multiples of 
3, and so on up to the largest integer not exceeding √n, then
only primes remain. Studies on Goldbach’s conjecture gave rise 
to the Hardy-Littlewood circle method of analysis and, in par- 
ticular, to the celebrated theorem of Vinogradov to the effect 
that every sufficiently large odd integer is the sum of three primes. 
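
The sieve of Eratosthenes mentioned above can be sketched as follows. The function name is illustrative, and two common implementation conventions are assumed that the text leaves implicit: each prime d is itself retained (only its proper multiples are struck out), and striking starts at d·d, because smaller multiples of d have already been removed by smaller primes.

```python
def eratosthenes(n: int) -> list[int]:
    """Return the primes among 2, 3, ..., n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    d = 2
    while d * d <= n:                      # only d up to the square root of n
        if is_prime[d]:
            for multiple in range(d * d, n + 1, d):
                is_prime[multiple] = False  # strike out proper multiples of d
        d += 1
    return [m for m in range(2, n + 1) if is_prime[m]]
```

For example, `eratosthenes(30)` yields 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.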

7 Further reading 

For a good account of the Peano axioms see E. Landau, 
Foundations of analysis (Chelsea Publ. Co., New York, 1951). 

The division algorithm, Euclid’s algorithm and the fundamental
theorem of arithmetic are discussed in every elementary
text on number theory. The tracts are too numerous to list here
but for many years the book by G. H. Hardy and E. M. Wright,
An introduction to the theory of numbers (Oxford U.P., 5th edn,
1979) has been regarded as a standard work in the field. The
books of similar title by T. Nagell (Wiley, New York, 1951) and
H. M. Stark (MIT Press, Cambridge, Mass., 1978) are also to be
recommended, as well as the volume by E. Landau, Elementary
number theory (Chelsea Publ. Co., New York, 1958).

For properties of the primes, see the book by Hardy and Wright
mentioned above and, for more advanced reading, see, for
instance, H. Davenport, Multiplicative number theory
(Springer-Verlag, Berlin, 2nd ed, 1980) and H. Halberstam and H. E.
Richert, Sieve methods (Academic Press, London and New
York, 1974). The latter contains, in particular, a proof of Chen’s
theorem. The result referred to on a polynomial in several
variables representing primes arose from work of Davis, Robinson,
Putnam and Matiyasevich on Hilbert’s tenth problem; see, for
instance, the article in American Math. Monthly 83 (1976),
449-64, where it is shown that 12 variables suffice.

8 Exercises 

(i) Find integers x, y such that 95x + 432y = 1.

(ii) Find integers x, y, z such that 35x + 55y + 77z = 1.

(iii) Prove that 1 + 1/2 + ... + 1/n is not an integer for n > 1.

(iv) Prove that
({a, b}, {b, c}, {c, a}) = {(a, b), (b, c), (c, a)}.

(v) Prove that if $g_1, g_2, \ldots$ are integers > 1 then every
natural number can be expressed uniquely in the form
$a_0 + a_1 g_1 + a_2 g_1 g_2 + \cdots + a_k g_1 \cdots g_k$, where the
$a_j$ are integers satisfying $0 \le a_j < g_{j+1}$.

(vi) Show that there exist infinitely many primes of the
form 4n + 3.

(vii) Show that, if $2^n + 1$ is a prime then it is in fact a
Fermat prime.

(viii) Show that, if m > n, then $2^{2^n} + 1$ divides $2^{2^m} - 1$ and so
$(2^{2^m} + 1, 2^{2^n} + 1) = 1$.

(ix) Deduce that $p_{n+1} \le 2^{2^n} + 1$, whence $\pi(x) \ge \log\log x$ for
x ≥ 2.


1 The function [x]

For any real x, one signifies by [x] the largest integer
≤ x, that is, the unique integer such that x - 1 < [x] ≤ x. The
function is called ‘the integral part of x’. It is readily verified
that [x + y] ≥ [x] + [y] and that, for any positive integer n,
[x + n] = [x] + n and [x/n] = [[x]/n]. The difference x - [x] is
called ‘the fractional part of x’; it is written {x} and satisfies
0 ≤ {x} < 1.

Let now p be a prime. The largest integer l such that $p^l$ divides
n! can be neatly expressed in terms of the above function. In
fact, on noting that [n/p] of the numbers 1, 2, ..., n are
divisible by p, that $[n/p^2]$ are divisible by $p^2$, and so on, we
obtain

$$l = \sum_{m=1}^{n} \sum_{\substack{j=1 \\ p^j \mid m}}^{\infty} 1
    = \sum_{j=1}^{\infty} \sum_{\substack{m=1 \\ p^j \mid m}}^{n} 1
    = \sum_{j=1}^{\infty} [n/p^j].$$

It follows easily that l ≤ [n/(p - 1)]; for the latter sum is at most
$n(1/p + 1/p^2 + \cdots)$. The result also shows at once that the
binomial coefficient

$$\binom{m}{n} = \frac{m!}{n!\,(m-n)!}$$

is an integer; for we have

$$[m/p^j] \ge [n/p^j] + [(m-n)/p^j].$$

Indeed, more generally, if $n_1, \ldots, n_k$ are positive integers such
that $n_1 + \cdots + n_k = m$ then the expression
$m!/(n_1! \cdots n_k!)$ is an integer.
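
The formula for the exact power of p dividing n! lends itself to a short computation. The following sketch uses hypothetical helper names; `exponent_of` is included only as an independent check that divides out p directly.

```python
from math import factorial

def prime_power_in_factorial(n: int, p: int) -> int:
    """Largest l with p^l dividing n!, computed as the sum of [n/p^j] over j >= 1."""
    total, q = 0, p
    while q <= n:          # terms with p^j > n contribute [n/p^j] = 0
        total += n // q
        q *= p
    return total

def exponent_of(p: int, N: int) -> int:
    """Largest e with p^e dividing the positive integer N, by repeated division."""
    e = 0
    while N % p == 0:
        N //= p
        e += 1
    return e
```

For n = 10 and p = 2 the sum is 5 + 2 + 1 = 8, which agrees with `exponent_of(2, factorial(10))`.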




2 Multiplicative functions

A real function f defined on the positive integers is said
to be multiplicative if f(m)f(n) = f(mn) for all m, n with
(m, n) = 1. We shall meet many examples. Plainly if f is
multiplicative and does not vanish identically then f(1) = 1.
Further if $n = p_1^{j_1} \cdots p_k^{j_k}$ in standard form then

$$f(n) = f(p_1^{j_1}) \cdots f(p_k^{j_k}).$$

Thus to evaluate f it suffices to calculate its values on the prime
powers; we shall appeal to this property frequently.

We shall also use the fact that if f is multiplicative and if

$$g(n) = \sum_{d \mid n} f(d),$$

where the sum is over all divisors d of n, then g is a multiplicative
function. Indeed, if (m, n) = 1, we have

$$g(mn) = \sum_{d \mid m} \sum_{d' \mid n} f(dd')
        = \sum_{d \mid m} f(d) \sum_{d' \mid n} f(d') = g(m)g(n).$$

3 Euler’s (totient) function $\phi(n)$

By $\phi(n)$ we mean the number of numbers 1, 2, ..., n
that are relatively prime to n. Thus, in particular,
$\phi(1) = \phi(2) = 1$ and $\phi(3) = \phi(4) = 2$.

We shall show, in the next chapter, from properties of
congruences, that $\phi$ is multiplicative. Now, as is easily verified,
$\phi(p^j) = p^j - p^{j-1}$ for all prime powers $p^j$. It follows at
once that

$$\phi(n) = n \prod_{p \mid n} (1 - 1/p).$$

We proceed to establish this formula directly without assuming
that $\phi$ is multiplicative. In fact the formula furnishes another
proof of this property.

Let $p_1, \ldots, p_k$ be the distinct prime factors of n. Then it
suffices to show that $\phi(n)$ is given by

$$n - \sum_{r} (n/p_r) + \sum_{r>s} n/(p_r p_s) - \sum_{r>s>t} n/(p_r p_s p_t) + \cdots.$$

But $n/p_r$ is the number of numbers 1, 2, ..., n that are divisible
by $p_r$, $n/(p_r p_s)$ is the number that are divisible by $p_r p_s$, and so
on. Hence the above expression is

$$\sum_{m=1}^{n} \Bigl( 1 - \sum_{\substack{r \\ p_r \mid m}} 1
  + \sum_{\substack{r>s \\ p_r p_s \mid m}} 1 - \cdots \Bigr)
  = \sum_{m=1}^{n} \Bigl( 1 - \binom{l}{1} + \binom{l}{2} - \cdots \Bigr),$$

where l = l(m) is the number of primes $p_1, \ldots, p_k$ that divide
m. Now the summand on the right is $(1 - 1)^l = 0$ if l > 0, and it
is 1 if l = 0. The required result follows. The demonstration is
a particular example of an argument due to Sylvester.

It is a simple consequence of the multiplicative property of
$\phi$ that

$$\sum_{d \mid n} \phi(d) = n.$$

In fact the expression on the left is multiplicative and, when
$n = p^j$, it becomes

$$\phi(1) + \phi(p) + \cdots + \phi(p^j)
  = 1 + (p - 1) + \cdots + (p^j - p^{j-1}) = p^j.$$
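
Both descriptions of Euler's function, the direct count and the product over the prime divisors of n, can be compared numerically. The sketch below uses hypothetical function names and assumes trial division to find the prime divisors.

```python
from math import gcd

def phi_by_count(n: int) -> int:
    """Count the numbers 1, 2, ..., n that are relatively prime to n."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def phi_by_product(n: int) -> int:
    """Evaluate n * prod(1 - 1/p) over the distinct primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:        # remove every factor p from m
                m //= p
            result -= result // p    # multiply result by (1 - 1/p), exactly
        p += 1
    if m > 1:                        # a single prime factor may remain
        result -= result // m
    return result

def divisor_sum_of_phi(n: int) -> int:
    """Sum of phi(d) over all divisors d of n."""
    return sum(phi_by_product(d) for d in range(1, n + 1) if n % d == 0)
```

For n = 12 both functions return 4, and `divisor_sum_of_phi(12)` returns 12, as the identity for the sum over divisors predicts.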

4 The Möbius function $\mu(n)$

This is defined, for any positive integer n, as 0 if n
contains a squared factor, and as $(-1)^k$ if $n = p_1 \cdots p_k$ as a
product of k distinct primes. Further, by convention, $\mu(1) = 1$.

It is clear that $\mu$ is multiplicative. Thus the function

$$\nu(n) = \sum_{d \mid n} \mu(d)$$

is also multiplicative. Now for all prime powers $p^j$ with j > 0
we have $\nu(p^j) = \mu(1) + \mu(p) = 0$. Hence we obtain the basic
property, namely $\nu(n) = 0$ for n > 1 and $\nu(1) = 1$. We proceed to
use this property to establish the Möbius inversion formulae.

Let f be any arithmetical function, that is a function defined
on the positive integers, and let

$$g(n) = \sum_{d \mid n} f(d).$$

Then we have

$$f(n) = \sum_{d \mid n} \mu(d) g(n/d).$$

In fact the right hand side is

$$\sum_{d \mid n} \sum_{d' \mid n/d} \mu(d) f(d') = \sum_{d' \mid n} f(d') \nu(n/d'),$$

and the result follows since $\nu(n/d') = 0$ unless d' = n. The
converse also holds, for we can write the second equation in the form

$$f(n) = \sum_{d' \mid n} \mu(n/d') g(d')$$

and then

$$\sum_{d \mid n} f(d) = \sum_{d \mid n} f(n/d)
  = \sum_{d \mid n} \sum_{d' \mid n/d} \mu(n/dd') g(d')
  = \sum_{d' \mid n} g(d') \nu(n/d').$$

Again we have $\nu(n/d') = 0$ unless d' = n, whence the expression
on the right is g(n).
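
The inversion formula can be checked on a concrete example. In the sketch below all names are illustrative, and the test function f(n) = n² + 1 is an arbitrary choice made here, not one taken from the text.

```python
def mobius(n: int) -> int:
    """0 if n has a squared factor, otherwise (-1)^k for k distinct prime factors."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                    # one further prime factor remains
        result = -result
    return result

def divisors(n: int) -> list[int]:
    return [d for d in range(1, n + 1) if n % d == 0]

def divisor_sum(f, n: int) -> int:
    """g(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))

def mobius_inverse(g, n: int) -> int:
    """Sum of mu(d) * g(n/d) over the divisors d of n."""
    return sum(mobius(d) * g(n // d) for d in divisors(n))

def f(n: int) -> int:            # arbitrary arithmetical function for the check
    return n * n + 1

def g(n: int) -> int:
    return divisor_sum(f, n)
```

With these definitions `mobius_inverse(g, n)` reproduces `f(n)`, and `divisor_sum(mobius, n)` is 1 for n = 1 and 0 otherwise.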

The Euler and Möbius functions are related by the equation

$$\phi(n) = n \sum_{d \mid n} \mu(d)/d.$$

This can be seen directly from the formula for $\phi$ established in
§ 3, and it also follows at once by Möbius inversion from the
property of $\phi$ recorded at the end of § 3. Indeed the relation is
clear from the multiplicative properties of $\phi$ and $\mu$.

There is an analogue of Möbius inversion for functions defined
over the reals, namely if

$$g(x) = \sum_{n \le x} f(x/n)$$

then

$$f(x) = \sum_{n \le x} \mu(n) g(x/n).$$

In fact the last sum is

$$\sum_{n \le x} \sum_{m \le x/n} \mu(n) f(x/mn) = \sum_{l \le x} f(x/l) \nu(l)$$

and the result follows since $\nu(l) = 0$ for l > 1. We shall give several
applications of Möbius inversion in the examples at the end of
the chapter.

5 The functions $\tau(n)$ and $\sigma(n)$

For any positive integer n, we denote by $\tau(n)$ the number
of divisors of n (in some books, in particular in that of Hardy
and Wright, the function is written d(n)). By $\sigma(n)$ we denote
the sum of the divisors of n. Thus

$$\tau(n) = \sum_{d \mid n} 1, \qquad \sigma(n) = \sum_{d \mid n} d.$$

It is plain that both $\tau(n)$ and $\sigma(n)$ are multiplicative. Further,
for any prime power $p^j$ we have $\tau(p^j) = j + 1$ and

$$\sigma(p^j) = 1 + p + \cdots + p^j = (p^{j+1} - 1)/(p - 1).$$

Thus if $p^j$ is the highest power of p that divides n then

$$\tau(n) = \prod_{p \mid n} (j + 1), \qquad
  \sigma(n) = \prod_{p \mid n} (p^{j+1} - 1)/(p - 1).$$

It is easy to give rough estimates for the sizes of $\tau(n)$ and $\sigma(n)$.
Indeed we have $\tau(n) < cn^{\delta}$ for any $\delta > 0$, where c is a number
depending only on $\delta$; for the function $f(n) = \tau(n)/n^{\delta}$ is
multiplicative and satisfies $f(p^j) = (j + 1)/p^{j\delta} < 1$ for all but a
finite number of values of p and j, the exceptions being bounded in
terms of $\delta$. Further we have

$$\sigma(n) = n \sum_{d \mid n} 1/d \le n \sum_{d \le n} 1/d < n(1 + \log n).$$

The last estimate implies that $\phi(n) > \tfrac{1}{4} n/\log n$ for n > 1. In fact
the function $f(n) = \sigma(n)\phi(n)/n^2$ is multiplicative and, for any
prime power $p^j$, we have

$$f(p^j) = 1 - p^{-j-1} \ge 1 - 1/p^2;$$

hence, since

$$\prod_{p \mid n} (1 - 1/p^2) \ge \prod_{m=2}^{\infty} (1 - 1/m^2) = \tfrac{1}{2},$$

it follows that $\sigma(n)\phi(n) \ge \tfrac{1}{2} n^2$, and this together with
$\sigma(n) < 2n \log n$ for n > 2 gives the estimate for $\phi$.
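
The prime-power formulas give a direct way of computing the number and the sum of the divisors. The function below is a sketch under the assumption that n is small enough for trial division; its name is illustrative.

```python
def tau_sigma(n: int) -> tuple[int, int]:
    """Return (number of divisors, sum of divisors) of n from its prime powers."""
    tau, sigma = 1, 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            j = 0
            while n % p == 0:    # find the highest power p^j dividing n
                n //= p
                j += 1
            tau *= j + 1
            sigma *= (p ** (j + 1) - 1) // (p - 1)
        p += 1
    if n > 1:                    # a remaining prime factor with exponent 1
        tau *= 2
        sigma *= n + 1
    return tau, sigma
```

For n = 12 = 2²·3 this returns (6, 28): the divisors are 1, 2, 3, 4, 6, 12, with sum 28, and 28 is indeed less than 12(1 + log 12).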

6 Average orders 

It is often of interest to determine the magnitude ‘on
average’ of arithmetical functions f, that is, to find estimates for
sums of the form $\sum f(n)$ with n ≤ x, where x is a large real number.
We shall obtain such estimates when f is $\tau$, $\sigma$ and $\phi$.

First we observe that

$$\sum_{n \le x} \tau(n) = \sum_{n \le x} \sum_{d \mid n} 1
  = \sum_{d \le x} \sum_{m \le x/d} 1 = \sum_{d \le x} [x/d].$$

Now we have

$$\sum_{d \le x} 1/d = \log x + O(1),$$

and hence

$$\sum_{n \le x} \tau(n) = x \log x + O(x).$$

This implies that $(1/x) \sum \tau(n) \sim \log x$ as $x \to \infty$. The
argument can be refined to give

$$\sum_{n \le x} \tau(n) = x \log x + (2\gamma - 1)x + O(\sqrt{x}),$$

where $\gamma$ is Euler’s constant. Note that although one can say that
the ‘average order’ of $\tau(n)$ is log n (since $\sum \log n \sim x \log x$),
it is not true that ‘almost all’ numbers have about log n divisors;
here almost all numbers are said to have a certain property if the
number of n ≤ x not possessing the property is o(x). In fact ‘almost
all’ numbers have about $(\log n)^{\log 2}$ divisors, that is, for any
$\epsilon > 0$ and for almost all n, the function
$\tau(n)/(\log n)^{\log 2}$ lies between $(\log n)^{\epsilon}$ and
$(\log n)^{-\epsilon}$.

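To determine the average order of $\sigma(n)$ we observe that

$$\sum_{n \le x} \sigma(n) = \sum_{n \le x} \sum_{d \mid n} (n/d)
  = \sum_{d \le x} \sum_{m \le x/d} m.$$

The last sum is

$$\tfrac{1}{2}[x/d]([x/d] + 1) = \tfrac{1}{2}(x/d)^2 + O(x/d).$$

Now

$$\sum_{d \le x} 1/d^2 = \sum_{d=1}^{\infty} 1/d^2 + O(1/x),$$

and thus we obtain

$$\sum_{n \le x} \sigma(n) = \tfrac{1}{12}\pi^2 x^2 + O(x \log x).$$

This implies that the ‘average order’ of $\sigma(n)$ is $\tfrac{1}{6}\pi^2 n$
(since $\sum n \sim \tfrac{1}{2}x^2$).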

Finally we derive an average estimate for $\phi$. We have

$$\sum_{n \le x} \phi(n) = \sum_{n \le x} \sum_{d \mid n} \mu(d)(n/d) = \sum_{d \le x} \mu(d) \sum_{m \le x/d} m.$$

The last sum is

$$\tfrac12 (x/d)^2 + O(x/d).$$

Now

$$\sum_{d \le x} \mu(d)/d^2 = \sum_{d=1}^{\infty} \mu(d)/d^2 + O(1/x),$$

and the infinite series here has sum $6/\pi^2$, as will be clear from § 8. Hence we obtain

$$\sum_{n \le x} \phi(n) = (3/\pi^2)x^2 + O(x \log x).$$

This implies that the ‘average order’ of $\phi(n)$ is $6n/\pi^2$. Moreover the result shows that the probability that two integers be relatively prime is $6/\pi^2$. For there are $\tfrac12 n(n + 1)$ pairs of integers $p$, $q$ with $1 \le p \le q \le n$, and precisely $\phi(1) + \cdots + \phi(n)$ of the corresponding fractions $p/q$ are in their lowest terms.
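The counting argument for relatively prime pairs can be checked numerically. The sketch below computes $\phi(1) + \cdots + \phi(n)$ with a sieve, counts the pairs $p \le q \le n$ with $(p, q) = 1$ directly, and compares the sum with $(3/\pi^2)n^2$. The helper names `phi_summatory` and `coprime_pairs` are hypothetical, chosen only for this illustration.

```python
from math import gcd, pi


def phi_summatory(x):
    """Sum of phi(n) for n <= x, using a sieve for Euler's function."""
    phi = list(range(x + 1))
    for p in range(2, x + 1):
        if phi[p] == p:  # p has not been reduced yet, so p is prime
            for k in range(p, x + 1, p):
                phi[k] -= phi[k] // p
    return sum(phi[1:])


def coprime_pairs(n):
    """Number of pairs (p, q) with 1 <= p <= q <= n and gcd(p, q) = 1."""
    return sum(1 for q in range(1, n + 1)
               for p in range(1, q + 1) if gcd(p, q) == 1)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, phi_summatory(n), round(3 * n * n / pi ** 2, 1))
```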

7 Perfect numbers 

A natural number $n$ is said to be perfect if $\sigma(n) = 2n$, that is if $n$ is equal to the sum of its divisors other than itself. Thus, for instance, 6 and 28 are perfect numbers.

Whether there exist any odd perfect numbers is a notorious unresolved problem. By contrast, however, the even perfect numbers can be specified precisely. Indeed an even number is perfect if and only if it has the form $2^{p-1}(2^p - 1)$, where both $p$ and $2^p - 1$ are primes. It suffices to prove the necessity, for it is readily verified that numbers of this form are certainly perfect. Suppose therefore that $\sigma(n) = 2n$ and that $n = 2^k m$, where $k$ and $m$ are positive integers with $m$ odd. We have $(2^{k+1} - 1)\sigma(m) = 2^{k+1} m$ and hence $\sigma(m) = 2^{k+1} l$ and $m = (2^{k+1} - 1)l$ for some positive integer $l$. If now $l$ were greater than 1 then $m$ would have distinct divisors $l$, $m$ and 1, whence we would have $\sigma(m) \ge l + m + 1$. But $l + m = 2^{k+1} l = \sigma(m)$, and this gives a contradiction. Thus $l = 1$ and $\sigma(m) = m + 1$, which implies that $m$ is a prime. In fact $m$ is a Mersenne prime and hence $k + 1$ is a prime $p$, say (cf. § 6 of Chapter 1). This shows that $n$ has the required form.
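The characterization of even perfect numbers lends itself to a small experiment. The following sketch, with hypothetical helper names, lists the numbers $2^{p-1}(2^p - 1)$ for which $p$ and $2^p - 1$ are both prime, and confirms $\sigma(n) = 2n$ by brute force; it is meant only as an illustration and is far too slow for large $p$.

```python
def sigma(n):
    """Sum of all positive divisors of n (brute force)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def is_prime(n):
    """Primality by trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def even_perfect_numbers(max_p):
    """Numbers 2^(p-1) (2^p - 1) with p <= max_p and both p, 2^p - 1 prime."""
    return [2 ** (p - 1) * (2 ** p - 1)
            for p in range(2, max_p + 1)
            if is_prime(p) and is_prime(2 ** p - 1)]


if __name__ == "__main__":
    for n in even_perfect_numbers(7):
        print(n, sigma(n) == 2 * n)
```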

8 The Riemann zeta-function

In a classic memoir of 1860 Riemann showed that questions concerning the distribution of the primes are intimately related to properties of the zeta-function

$$\zeta(s) = \sum_{n=1}^{\infty} 1/n^s,$$

where $s$ denotes a complex variable. It is clear that the series converges absolutely for $\sigma > 1$, where $s = \sigma + it$ with $\sigma$, $t$ real, and indeed that it converges uniformly for $\sigma > 1 + \delta$ for any $\delta > 0$. Riemann showed that $\zeta(s)$ can be continued analytically throughout the complex plane and that it is regular there except for a simple pole at $s = 1$ with residue 1. He showed moreover that it satisfies the functional equation $\Xi(s) = \Xi(1 - s)$, where

$$\Xi(s) = \pi^{-\frac12 s}\,\Gamma(\tfrac12 s)\,\zeta(s).$$

The fundamental connection between the zeta-function and the primes is given by the Euler product

$$\zeta(s) = \prod_{p} (1 - 1/p^s)^{-1},$$

valid for $\sigma > 1$. The relation is readily verified; in fact it is clear that, for any positive integer $N$,

$$\prod_{p \le N} (1 - 1/p^s)^{-1} = \prod_{p \le N} (1 + p^{-s} + p^{-2s} + \cdots) = \sum_{m} m^{-s},$$

where $m$ runs through all the positive integers that are divisible only by primes $\le N$, and

$$\Bigl| \sum_{m} m^{-s} - \sum_{n \le N} n^{-s} \Bigr| \le \sum_{n > N} n^{-\sigma} \to 0 \quad \text{as } N \to \infty.$$
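The verification of the Euler product can be mirrored numerically at a real point such as $s = 2$, where $\zeta(2) = \pi^2/6$. The sketch below compares the product over primes $p \le N$ with the partial sum over $n \le N$; the function names are assumptions of this example, and floating-point arithmetic is used, so the comparison is only approximate.

```python
import math


def primes_up_to(N):
    """Sieve of Eratosthenes: all primes p <= N (N >= 2)."""
    is_prime = [True] * (N + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(N ** 0.5) + 1):
        if is_prime[p]:
            for k in range(p * p, N + 1, p):
                is_prime[k] = False
    return [p for p in range(2, N + 1) if is_prime[p]]


def truncated_euler_product(s, N):
    """Product of (1 - p^(-s))^(-1) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


def truncated_series(s, N):
    """Partial sum of n^(-s) over n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))


if __name__ == "__main__":
    N = 1000
    print(truncated_series(2, N), truncated_euler_product(2, N), math.pi ** 2 / 6)
```

For real $s > 1$ the truncated product lies between the partial sum and $\zeta(s)$, since it equals the sum of $m^{-s}$ over all $m$ composed of primes $\le N$.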

The Euler product shows that $\zeta(s)$ has no zeros for $\sigma > 1$. In view of the functional equation it follows that $\zeta(s)$ has no zeros for $\sigma < 0$ except at the points $s = -2, -4, -6, \ldots$; these are termed the ‘trivial zeros’. All other zeros of $\zeta(s)$ must lie in the ‘critical strip’ given by $0 \le \sigma \le 1$, and Riemann conjectured that they in fact lie on the line $\sigma = \tfrac12$. This is the famous Riemann hypothesis and it remains unproved to this day. There is much evidence in favour of the hypothesis; in particular Hardy proved in 1915 that infinitely many zeros of $\zeta(s)$ lie on the critical line, and extensive computations have verified that at least the first three million zeros above the real axis do so. It has been shown that, if the hypothesis is true, then, for instance, there is a refinement of the prime number theorem to the effect that

$$\pi(x) = \int_2^x \frac{dt}{\log t} + O(\sqrt{x} \log x),$$

and that the difference between consecutive primes satisfies $p_{n+1} - p_n = O(p_n^{\frac12 + \varepsilon})$. In fact it has been shown that there is a narrow zero-free region for $\zeta(s)$ to the left of the line $\sigma = 1$, and this implies that results as above are indeed valid but with weaker error terms. It is also known that the Riemann hypothesis is equivalent to the assertion that, for any $\varepsilon > 0$,

$$\sum_{n \le x} \mu(n) = O(x^{\frac12 + \varepsilon}).$$

The basic relation between the Möbius function and the Riemann zeta-function is given by

$$1/\zeta(s) = \sum_{n=1}^{\infty} \mu(n)/n^s.$$

This is clearly valid for $\sigma > 1$ since the product of the series on the right with $\sum 1/n^s$ is $\sum \nu(n)/n^s$. In fact if the Riemann hypothesis holds then the equation remains true for $\sigma > \tfrac12$. There is a similar equation for the Euler function, valid for $\sigma > 2$, namely

$$\zeta(s - 1)/\zeta(s) = \sum_{n=1}^{\infty} \phi(n)/n^s.$$

This is readily verified from the result at the end of § 3. Likewise there are equations for $\tau(n)$ and $\sigma(n)$, valid respectively for $\sigma > 1$ and $\sigma > 2$, namely

$$(\zeta(s))^2 = \sum_{n=1}^{\infty} \tau(n)/n^s, \qquad \zeta(s)\zeta(s - 1) = \sum_{n=1}^{\infty} \sigma(n)/n^s.$$
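The relation between $\mu$ and $\zeta$ can be sampled at $s = 2$, where $1/\zeta(2) = 6/\pi^2$. The sketch below builds a table of $\mu(n)$ with a sieve, checks that $\sum_{d \mid n} \mu(d)$ is 1 for $n = 1$ and 0 otherwise, and sums $\mu(n)/n^2$. The name `mobius_table` and the cut-off used are assumptions of this example.

```python
from math import pi


def mobius_table(N):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not is_composite[p]:  # p is prime
            for k in range(p, N + 1, p):
                if k > p:
                    is_composite[k] = True
                mu[k] = -mu[k]  # one more prime factor
            for k in range(p * p, N + 1, p * p):
                mu[k] = 0  # divisible by a square
    return mu


def partial_inverse_zeta(s, N):
    """Partial sum of mu(n) / n^s over n <= N."""
    mu = mobius_table(N)
    return sum(mu[n] / n ** s for n in range(1, N + 1))


if __name__ == "__main__":
    print(partial_inverse_zeta(2, 10000), 6 / pi ** 2)
```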

9 Further reading 

The elementary arithmetical functions are discussed in every introductory text on number theory; again Hardy and Wright is a good reference. As regards the last section, the most comprehensive work on the subject is that of E. C. Titchmarsh, The theory of the Riemann zeta-function (Oxford U.P., 1951). Other books to be recommended are those of T. M. Apostol (Springer-Verlag, Berlin, 1976) and K. Chandrasekharan (Springer-Verlag, Berlin, 1968), both with the title Introduction to analytic number theory; see also Chandrasekharan’s Arithmetical functions (Springer-Verlag, Berlin, 1970).

10 Exercises 

(i) Evaluate $\sum_{d \mid n} \mu(d)\sigma(d)$ in terms of the distinct prime factors of $n$.

(ii) Let $\Lambda(n) = \log p$ if $n$ is a power of a prime $p$ and let $\Lambda(n) = 0$ otherwise ($\Lambda$ is called von Mangoldt’s function). Evaluate $\sum_{d \mid n} \Lambda(d)$. Express $\sum \Lambda(n)/n^s$ in terms of $\zeta(s)$.

(iii) Let $a$ run through all the integers with $1 \le a \le n$ and $(a, n) = 1$. Show that $f(n) = (1/n)\sum a$ satisfies $\sum_{d \mid n} f(d) = \tfrac12 (n + 1)$. Hence prove that $f(n) = \tfrac12 \phi(n)$ for $n > 1$.

(iv) Let $a$ run through the integers as in (iii). Prove that $(1/n^3)\sum a^3 = \tfrac14 \phi(n)\bigl(1 + (-1)^k p_1 \cdots p_k/n^2\bigr)$, where $p_1, \ldots, p_k$ are the distinct prime factors of $n$ ($> 1$).

(v) Show that the product of all the integers $a$ in (iii) is given by $n^{\phi(n)} \prod_{d \mid n} (d!/d^d)^{\mu(n/d)}$.

(vi) Show that $\sum_{n \le x} \mu(n)[x/n] = 1$. Hence prove that $\bigl|\sum_{n \le x} \mu(n)/n\bigr| \le 1$.

(vii) Let $m$, $n$ be positive integers and let $d$ run through all divisors of $(m, n)$. Prove that $\sum d\,\mu(n/d) = \mu(n/(m, n))\,\phi(n)/\phi(n/(m, n))$. (The sum here is called Ramanujan’s sum.)

(viii) Prove that $\sum_{n=1}^{\infty} \phi(n)x^n/(1 - x^n) = x/(1 - x)^2$. (Series of this kind are called Lambert series.)

(ix) Prove that $\sum_{n \le x} \phi(n)/n = (6/\pi^2)x + O(\log x)$.
1 Definitions

Suppose that $a$, $b$ are integers and that $n$ is a natural number. By $a \equiv b \pmod n$ one means $n$ divides $b - a$; and one says that $a$ is congruent to $b$ modulo $n$. If $0 \le b < n$ then one refers to $b$ as the residue of $a$ (mod $n$). It is readily verified that the congruence relation is an equivalence relation; the equivalence classes are called residue classes or congruence classes. By a complete set of residues (mod $n$) one means a set of $n$ integers one from each residue class (mod $n$).

It is clear that if $a \equiv a' \pmod n$ and $b \equiv b' \pmod n$ then $a + b \equiv a' + b'$ and $a - b \equiv a' - b' \pmod n$. Further we have $ab \equiv a'b' \pmod n$, since $n$ divides $(a - a')b + a'(b - b')$. Furthermore, if $f(x)$ is any polynomial with integer coefficients, then $f(a) \equiv f(a') \pmod n$.

Note also that if $ka \equiv ka' \pmod n$ for some natural number $k$ with $(k, n) = 1$ then $a \equiv a' \pmod n$: thus if $a_1, \ldots, a_n$ is a complete set of residues (mod $n$) then so is $ka_1, \ldots, ka_n$. More generally, if $k$ is any natural number such that $ka \equiv ka' \pmod n$ then $a \equiv a' \pmod{n/(k, n)}$, since obviously $k/(k, n)$ and $n/(k, n)$ are relatively prime.

2 Chinese remainder theorem 

Let $a$, $n$ be natural numbers and let $b$ be any integer. We prove first that the linear congruence $ax \equiv b \pmod n$ is soluble for some integer $x$ if and only if $(a, n)$ divides $b$. The condition is certainly necessary, for $(a, n)$ divides both $a$ and $n$. To prove the sufficiency, suppose that $d = (a, n)$ divides $b$. Put $a' = a/d$, $b' = b/d$ and $n' = n/d$. Then it suffices to solve $a'x \equiv b' \pmod{n'}$. But this has precisely one solution (mod $n'$), since $(a', n') = 1$ and so $a'x$ runs through a complete set of residues (mod $n'$) as $x$ runs through such a set. It is clear that if $x'$ is any solution of $a'x' \equiv b' \pmod{n'}$ then the complete set of solutions (mod $n$) of $ax \equiv b \pmod n$ is given by $x = x' + mn'$, where $m = 1, 2, \ldots, d$. Hence, when $d$ divides $b$, the congruence $ax \equiv b \pmod n$ has precisely $d$ solutions (mod $n$).
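The argument above is constructive, and a minimal Python sketch of it might look as follows. The function name `solve_linear_congruence` is hypothetical, and the modular inverse is taken with the built-in `pow(a, -1, n)`, which requires Python 3.8 or later. The solutions are listed with $m = 0, 1, \ldots, d - 1$ rather than $m = 1, \ldots, d$, which gives the same residue classes (mod $n$).

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x with 0 <= x < n and a*x congruent to b (mod n)."""
    d = gcd(a, n)
    if b % d != 0:
        return []  # soluble only if (a, n) divides b
    a1, b1, n1 = a // d, b // d, n // d
    # Unique solution of a1 * x = b1 (mod n1); trivial when n1 = 1.
    x0 = (b1 * pow(a1, -1, n1)) % n1 if n1 > 1 else 0
    return [x0 + m * n1 for m in range(d)]


if __name__ == "__main__":
    print(solve_linear_congruence(6, 4, 10))
    print(solve_linear_congruence(6, 3, 10))
```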

It follows from the last result that if $p$ is a prime and if $a$ is not divisible by $p$ then the congruence $ax \equiv b \pmod p$ is always soluble; in fact there is a unique solution (mod $p$). This implies that the residues $0, 1, \ldots, p - 1$ form a field under addition and multiplication (mod $p$). It is usual to denote the field by $\mathbb{Z}_p$.

We turn now to simultaneous linear congruences and prove the Chinese remainder theorem; the result was apparently known to the Chinese at least 1500 years ago. Let $n_1, \ldots, n_k$ be natural numbers and suppose that they are coprime in pairs, that is $(n_i, n_j) = 1$ for $i \ne j$. The theorem asserts that, for any integers $c_1, \ldots, c_k$, the congruences $x \equiv c_j \pmod{n_j}$, with $1 \le j \le k$, are soluble simultaneously for some integer $x$; in fact there is a unique solution modulo $n = n_1 \cdots n_k$. For the proof, let $m_j = n/n_j$ ($1 \le j \le k$). Then $(m_j, n_j) = 1$ and thus there is an integer $x_j$ such that $m_j x_j \equiv c_j \pmod{n_j}$. Now it is readily seen that $x = m_1 x_1 + \cdots + m_k x_k$ satisfies $x \equiv c_j \pmod{n_j}$, as required. The uniqueness is clear, for if $x$, $y$ are two solutions then $x \equiv y \pmod{n_j}$ for $1 \le j \le k$, whence, since the $n_j$ are coprime in pairs, we have $x \equiv y \pmod n$. Plainly the Chinese remainder theorem together with the first result of this section implies that if $n_1, \ldots, n_k$ are coprime in pairs then the congruences $a_j x \equiv b_j \pmod{n_j}$, with $1 \le j \le k$, are soluble simultaneously if and only if $(a_j, n_j)$ divides $b_j$ for all $j$.

As an example, consider the congruences $x \equiv 2 \pmod 5$, $x \equiv 3 \pmod 7$, $x \equiv 4 \pmod{11}$. In this case a solution is given by $x = 77x_1 + 55x_2 + 35x_3$, where $x_1$, $x_2$, $x_3$ satisfy $2x_1 \equiv 2 \pmod 5$, $6x_2 \equiv 3 \pmod 7$, $2x_3 \equiv 4 \pmod{11}$. Thus we can take $x_1 = 1$, $x_2 = 4$, $x_3 = 2$, and these give $x = 367$. The complete solution is $x \equiv -18 \pmod{385}$.
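The construction in the proof of the Chinese remainder theorem translates almost line by line into code. The sketch below, with the hypothetical name `chinese_remainder`, forms $m_j = n/n_j$, solves $m_j x_j \equiv c_j \pmod{n_j}$ and adds up the terms $m_j x_j$; it assumes Python 3.8 or later for `math.prod` and for modular inverses via `pow`, and `pow` raises `ValueError` if the moduli are not coprime in pairs.

```python
from math import prod


def chinese_remainder(residues, moduli):
    """Return (x, n) with x = c_j (mod n_j) for all j and 0 <= x < n = n_1...n_k."""
    n = prod(moduli)
    x = 0
    for c_j, n_j in zip(residues, moduli):
        m_j = n // n_j
        # Solve m_j * x_j = c_j (mod n_j); possible since (m_j, n_j) = 1.
        x_j = (c_j * pow(m_j, -1, n_j)) % n_j
        x += m_j * x_j
    return x % n, n


if __name__ == "__main__":
    x, n = chinese_remainder([2, 3, 4], [5, 7, 11])
    print(x, n, x - n)
```

Applied to the example in the text, the intermediate values $x_j$ come out as 1, 4 and 2, so the sum is 367 before reduction.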

3 The theorems of Fermat and Euler 

First we introduce the concept of a reduced set of residues (mod $n$). By this we mean a set of $\phi(n)$ numbers one from each of the $\phi(n)$ residue classes that consist of numbers relatively prime to $n$. In particular, the numbers $a$ with $1 \le a \le n$ and $(a, n) = 1$ form a reduced set of residues (mod $n$).

We proceed now to establish the multiplicative property of $\phi$, referred to in § 3 of Chapter 2, using the above concept. Accordingly let $n$, $n'$ be natural numbers with $(n, n') = 1$. Further let $a$ and $a'$ run through reduced sets of residues (mod $n$) and (mod $n'$) respectively. Then it suffices to prove that $an' + a'n$ runs through a reduced set of residues (mod $nn'$); for this implies that $\phi(n)\phi(n') = \phi(nn')$, as required. Now clearly, since $(a, n) = 1$ and $(a', n') = 1$, the number $an' + a'n$ is relatively prime to $n$ and to $n'$ and so to $nn'$. Furthermore any two distinct numbers of the form are incongruent (mod $nn'$). Thus we have only to prove that if $(b, nn') = 1$ then $b \equiv an' + a'n \pmod{nn'}$ for some $a$, $a'$ as above. But since $(n, n') = 1$ there exist integers $m$, $m'$ satisfying $mn' + m'n = 1$. Plainly $(bm, n) = 1$ and so $a \equiv bm \pmod n$ for some $a$; similarly $a' \equiv bm' \pmod{n'}$ for some $a'$, and now it is easily seen that $a$, $a'$ have the required property.

Fermat’s theorem states that if $a$ is any natural number and if $p$ is any prime then $a^p \equiv a \pmod p$. In particular, if $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod p$. The theorem was announced by Fermat in 1640 but without proof. Euler gave the first demonstration about a century later and, in 1760, he established a more general result to the effect that, if $a$, $n$ are natural numbers with $(a, n) = 1$, then $a^{\phi(n)} \equiv 1 \pmod n$. For the proof of Euler’s theorem, we observe simply that as $x$ runs through a reduced set of residues (mod $n$) so also $ax$ runs through such a set. Hence $\prod (ax) \equiv \prod (x) \pmod n$, where the products are taken over all $x$ in the reduced set, and the theorem follows on cancelling $\prod (x)$ from both sides.
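Both steps of this section can be observed on small moduli. The sketch below, using hypothetical helper names, lists a reduced set of residues, checks that multiplication by $a$ permutes it, and checks that the numbers $an' + a'n$ fill out a reduced set (mod $nn'$); it is a brute-force illustration rather than an efficient method.

```python
from math import gcd


def reduced_residues(n):
    """The numbers a with 1 <= a <= n and (a, n) = 1."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def phi(n):
    """Euler's function, counted directly from the definition."""
    return len(reduced_residues(n))


def combined_residues(n1, n2):
    """The residues (a*n2 + a2*n1) mod n1*n2 for a, a2 in reduced sets."""
    return sorted((a * n2 + a2 * n1) % (n1 * n2)
                  for a in reduced_residues(n1)
                  for a2 in reduced_residues(n2))


if __name__ == "__main__":
    n = 20
    for a in reduced_residues(n):
        print(a, pow(a, phi(n), n))
```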

4 Wilson’s theorem 

This asserts that $(p - 1)! \equiv -1 \pmod p$ for any prime $p$. Though the result is attributed to Wilson, the statement was apparently first published by Waring in his Meditationes algebraicae of 1770 and a proof was furnished a little later by Lagrange.

For the demonstration, it suffices to assume that $p$ is odd. Now to every integer $a$ with $0 < a < p$ there is a unique integer $a'$ with $0 < a' < p$ such that $aa' \equiv 1 \pmod p$. Further, if $a = a'$ then $a^2 \equiv 1 \pmod p$ whence $a = 1$ or $a = p - 1$. Thus the set $2, 3, \ldots, p - 2$ can be divided into $\tfrac12 (p - 3)$ pairs $a$, $a'$ with $aa' \equiv 1 \pmod p$. Hence we have $2 \cdot 3 \cdots (p - 2) \equiv 1 \pmod p$, and so $(p - 1)! \equiv p - 1 \equiv -1 \pmod p$, as required.

Wilson’s theorem admits a converse and so yields a criterion for primes. Indeed an integer $n > 1$ is a prime if and only if $(n - 1)! \equiv -1 \pmod n$. To verify the sufficiency note that any divisor of $n$, other than itself, must divide $(n - 1)!$.

As an immediate deduction from Wilson’s theorem we see that if $p$ is a prime with $p \equiv 1 \pmod 4$ then the congruence $x^2 \equiv -1 \pmod p$ has solutions $x = \pm(r!)$, where $r = \tfrac12 (p - 1)$. This follows on replacing $a + r$ in $(p - 1)!$ by the congruent integer $a - r - 1$ for each $a$ with $1 \le a \le r$. Note that the congruence has no solutions when $p \equiv 3 \pmod 4$, for otherwise we would have $x^{p-1} = x^{2r} \equiv (-1)^r = -1 \pmod p$ contrary to Fermat’s theorem.
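Wilson's criterion and the deduction about $x^2 \equiv -1$ are easy to test for small numbers. The following sketch uses `math.factorial` and hypothetical function names; as a primality test it is of no practical value, since the factorial grows far too quickly, and it is given only to illustrate the statements above.

```python
from math import factorial


def is_prime_wilson(n):
    """Primality via Wilson's criterion: (n-1)! = -1 (mod n)."""
    return n > 1 and factorial(n - 1) % n == n - 1


def sqrt_of_minus_one(p):
    """For a prime p = 1 (mod 4), return r! mod p with r = (p-1)/2."""
    r = (p - 1) // 2
    return factorial(r) % p


if __name__ == "__main__":
    print([n for n in range(2, 40) if is_prime_wilson(n)])
    for p in (5, 13, 17, 29):
        x = sqrt_of_minus_one(p)
        print(p, x, (x * x + 1) % p)
```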

5 Lagrange’s theorem 

Let $f(x)$ be a polynomial with integer coefficients and with degree $n$. Suppose that $p$ is a prime and that the leading coefficient of $f$, that is the coefficient of $x^n$, is not divisible by $p$. Lagrange’s theorem states that the congruence $f(x) \equiv 0 \pmod p$ has at most $n$ solutions (mod $p$).

The theorem certainly holds for $n = 1$ by the first result in § 2. We assume that it is valid for polynomials with degree $n - 1$ and proceed inductively to prove the theorem for polynomials with degree $n$. Now, for any integer $a$ we have $f(x) - f(a) = (x - a)g(x)$, where $g$ is a polynomial with degree $n - 1$, with integer coefficients and with the same leading coefficient as $f$. Thus if $f(x) \equiv 0 \pmod p$ has a solution $x = a$ then all solutions of the congruence satisfy $(x - a)g(x) \equiv 0 \pmod p$. But, by the inductive hypothesis, the congruence $g(x) \equiv 0 \pmod p$ has at most $n - 1$ solutions (mod $p$). The theorem follows. It is customary to write $f(x) \equiv g(x) \pmod p$ to signify that the coefficients of like powers of $x$ in the polynomials $f$, $g$ are congruent (mod $p$); and it is clear that if the congruence $f(x) \equiv 0 \pmod p$ has its full complement $a_1, \ldots, a_n$ of solutions (mod $p$) then

$$f(x) \equiv c(x - a_1) \cdots (x - a_n) \pmod p,$$

where $c$ is the leading coefficient of $f$. In particular, by Fermat’s theorem, we have

$$x^{p-1} - 1 \equiv (x - 1) \cdots (x - p + 1) \pmod p,$$

and, on comparing constant coefficients, we obtain another proof of Wilson’s theorem.

Plainly, instead of speaking of congruences, we can express the above succinctly in terms of polynomials defined over $\mathbb{Z}_p$. Thus Lagrange’s theorem asserts that the number of zeros in $\mathbb{Z}_p$ of a polynomial defined over this field cannot exceed its degree. As a corollary we deduce that, if $d$ divides $p - 1$ then the polynomial $x^d - 1$ has precisely $d$ zeros in $\mathbb{Z}_p$. For we have $x^{p-1} - 1 = (x^d - 1)g(x)$, where $g$ has degree $p - 1 - d$. But, by Fermat’s theorem, $x^{p-1} - 1$ has $p - 1$ zeros in $\mathbb{Z}_p$ and so $x^d - 1$ has at least $(p - 1) - (p - 1 - d) = d$ zeros in $\mathbb{Z}_p$, whence the assertion.

Lagrange’s theorem does not remain true for composite moduli. In fact it is readily verified from the Chinese remainder theorem that if $m_1, \ldots, m_k$ are natural numbers coprime in pairs, if $f(x)$ is a polynomial with integer coefficients, and if the congruence $f(x) \equiv 0 \pmod{m_j}$ has $s_j$ solutions (mod $m_j$), then the congruence $f(x) \equiv 0 \pmod m$, where $m = m_1 \cdots m_k$, has $s = s_1 \cdots s_k$ solutions (mod $m$). Lagrange’s theorem is still false for prime power moduli; for example $x^2 \equiv 1 \pmod 8$ has four solutions. But if the prime $p$ does not divide the discriminant of $f$ then the theorem holds for all powers $p^j$; indeed the number of solutions of $f(x) \equiv 0 \pmod{p^j}$ is, in this case, the same as the number of solutions of $f(x) \equiv 0 \pmod p$. This can be seen at once when, for instance, $f(x) = x^2 - a$; for if $p$ is any odd prime that does not divide $a$, then from a solution $y$ of $f(y) \equiv 0 \pmod{p^j}$ we obtain a solution $x = y + p^j z$ of $f(x) \equiv 0 \pmod{p^{j+1}}$ by solving the congruence $2yz + f(y)/p^j \equiv 0 \pmod p$ for $z$, as is possible since $(2y, p) = 1$.
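The lifting step described for $f(x) = x^2 - a$ can be sketched in a few lines. In the code below the names `lift_square_root` and `count_solutions` are hypothetical; the first carries out one step from $p^j$ to $p^{j+1}$ exactly as in the text, and the second counts solutions by brute force so that the claims about composite and prime power moduli can be checked on small cases. Python 3.8 or later is assumed for the modular inverse.

```python
def lift_square_root(y, a, p, j):
    """Given y^2 = a (mod p^j), p an odd prime not dividing a,
    return x = y + p^j z with x^2 = a (mod p^(j+1))."""
    pj = p ** j
    if (y * y - a) % pj != 0:
        raise ValueError("y is not a solution modulo p^j")
    t = (y * y - a) // pj  # this is f(y) / p^j
    # Solve 2*y*z + t = 0 (mod p) for z.
    z = (-t * pow(2 * y, -1, p)) % p
    return y + pj * z


def count_solutions(f, m):
    """Number of x in 0..m-1 with f(x) = 0 (mod m)."""
    return sum(1 for x in range(m) if f(x) % m == 0)


if __name__ == "__main__":
    y = 3  # 3^2 = 2 (mod 7)
    for j in range(1, 5):
        y = lift_square_root(y, 2, 7, j)
        print(j + 1, y, (y * y - 2) % 7 ** (j + 1))
```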


6 Primitive roots 

Let $a$, $n$ be natural numbers with $(a, n) = 1$. The least natural number $d$ such that $a^d \equiv 1 \pmod n$ is called the order of $a$ (mod $n$), and $a$ is said to belong to $d$ (mod $n$). By Euler’s theorem, the order $d$ exists and it divides $\phi(n)$. In fact $d$ divides every integer $k$ such that $a^k \equiv 1 \pmod n$, for, by the division algorithm, $k = dq + r$ with $0 \le r < d$, whence $a^r \equiv 1 \pmod n$ and so $r = 0$.

By a primitive root (mod $n$) we mean a number that belongs to $\phi(n)$ (mod $n$). We proceed to prove that for every odd prime $p$ there exist $\phi(p - 1)$ primitive roots (mod $p$). Now each of the numbers $1, 2, \ldots, p - 1$ belongs (mod $p$) to some divisor $d$ of $p - 1$; let $\psi(d)$ be the number that belongs to $d$ (mod $p$) so that

$$\sum_{d \mid (p-1)} \psi(d) = p - 1.$$

It will suffice to prove that if $\psi(d) \ne 0$ then $\psi(d) = \phi(d)$. For, by § 3 of Chapter 2, we have

$$\sum_{d \mid (p-1)} \phi(d) = p - 1,$$

whence $\psi(d) \ne 0$ for all $d$ and so $\psi(p - 1) = \phi(p - 1)$ as required.

To verify the assertion concerning $\psi$, suppose that $\psi(d) \ne 0$ and let $a$ be a number that belongs to $d$ (mod $p$). Then $a, a^2, \ldots, a^d$ are mutually incongruent solutions of $x^d \equiv 1 \pmod p$ and thus, by Lagrange’s theorem, they represent all the solutions (in fact we showed in § 5 that the congruence has precisely $d$ solutions (mod $p$)). It is now easily seen that the numbers $a^m$ with $1 \le m \le d$ and $(m, d) = 1$ represent all the numbers that belong to $d$ (mod $p$); indeed each has order $d$, for if $a^{md'} \equiv 1 \pmod p$ then $d \mid d'$, and if $b$ is any number that belongs to $d$ (mod $p$) then $b \equiv a^m$ for some $m$ with $1 \le m \le d$, and we have $(m, d) = 1$ since $b^{d/(m,d)} \equiv (a^d)^{m/(m,d)} \equiv 1 \pmod p$. This gives $\psi(d) = \phi(d)$, as asserted.

Let $g$ be a primitive root (mod $p$). We prove now that there exists an integer $x$ such that $g' = g + px$ is a primitive root (mod $p^j$) for all prime powers $p^j$. We have $g^{p-1} = 1 + py$ for some integer $y$ and so, by the binomial theorem, $g'^{\,p-1} = 1 + pz$, where

$$z \equiv y + (p - 1)g^{p-2}x \pmod p.$$

The coefficient of $x$ is not divisible by $p$ and so we can choose $x$ such that $(z, p) = 1$. Then $g'$ has the required property. For suppose that $g'$ belongs to $d$ (mod $p^j$). Then $d$ divides $\phi(p^j) = p^{j-1}(p - 1)$. But $g'$ is a primitive root (mod $p$) and thus $p - 1$ divides $d$. Hence $d = p^k(p - 1)$ for some $k < j$. Further, since $p$ is odd, we have

$$(1 + pz)^{p^k} = 1 + p^{k+1} z_k,$$

where $(z_k, p) = 1$. Now since $g'^{\,d} \equiv 1 \pmod{p^j}$ it follows that $j = k + 1$ and this gives $d = \phi(p^j)$, as required.

Finally we deduce that, for any natural number $n$, there exists a primitive root (mod $n$) if and only if $n$ has the form $2$, $4$, $p^j$ or $2p^j$, where $p$ is an odd prime. Clearly 1 and 3 are primitive roots (mod 2) and (mod 4). Further, if $g$ is a primitive root (mod $p^j$) then the odd element of the pair $g$, $g + p^j$ is a primitive root (mod $2p^j$), since $\phi(2p^j) = \phi(p^j)$. Hence it remains only to prove the necessity of the assertion. Now if $n = n_1 n_2$, where $(n_1, n_2) = 1$ and $n_1 > 2$, $n_2 > 2$, then there is no primitive root (mod $n$). For $\phi(n_1)$ and $\phi(n_2)$ are even and thus for any natural number $a$ with $(a, n) = 1$ we have

$$a^{\frac12 \phi(n)} = (a^{\phi(n_1)})^{\frac12 \phi(n_2)} \equiv 1 \pmod{n_1};$$

similarly $a^{\frac12 \phi(n)} \equiv 1 \pmod{n_2}$, whence $a^{\frac12 \phi(n)} \equiv 1 \pmod n$. Further, there are no primitive roots (mod $2^j$) for $j > 2$, since, by induction, we have $a^{2^{j-2}} \equiv 1 \pmod{2^j}$ for all odd numbers $a$. This proves the theorem.
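The results of this section can be checked by brute force for small moduli. The sketch below computes orders directly from the definition, lists the primitive roots (mod $n$), and compares the moduli that possess a primitive root with the forms $2$, $4$, $p^j$, $2p^j$. All function names are hypothetical, and the search is only practical for small $n$.

```python
from math import gcd


def phi(n):
    """Euler's function, counted directly."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def order(a, n):
    """Order of a (mod n); assumes n >= 2 and (a, n) = 1."""
    d, power = 1, a % n
    while power != 1:
        power = power * a % n
        d += 1
    return d


def primitive_roots(n):
    """All primitive roots (mod n) among 1, ..., n-1 (n >= 2)."""
    target = phi(n)
    return [a for a in range(1, n)
            if gcd(a, n) == 1 and order(a, n) == target]


def has_expected_form(n):
    """True if n is 2, 4, p^j or 2 p^j with p an odd prime (n >= 2)."""
    if n in (2, 4):
        return True
    m = n // 2 if n % 2 == 0 else n
    if m % 2 == 0 or m < 3:
        return False
    # The smallest divisor >= 3 of an odd number m >= 3 is a prime.
    p = next(d for d in range(3, m + 1) if m % d == 0)
    while m % p == 0:
        m //= p
    return m == 1


if __name__ == "__main__":
    for n in range(2, 30):
        print(n, primitive_roots(n), has_expected_form(n))
```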

7 Indices 

Let $g$ be a primitive root (mod $n$). The numbers $g^l$ with $l = 0, 1, \ldots, \phi(n) - 1$ form a reduced set of residues (mod $n$). Hence, for every integer $a$ with $(a, n) = 1$ there is a unique $l$ such that $g^l \equiv a \pmod n$. The exponent $l$ is called the index of $a$ with respect to $g$ and it is denoted by $\operatorname{ind} a$. Plainly we have

$$\operatorname{ind} a + \operatorname{ind} b \equiv \operatorname{ind}(ab) \pmod{\phi(n)},$$

and $\operatorname{ind} 1 = 0$, $\operatorname{ind} g = 1$. Further, for every natural number $m$, we have $\operatorname{ind}(a^m) \equiv m \operatorname{ind} a \pmod{\phi(n)}$. These properties of the index are clearly analogous to the properties of logarithms. We also have $\operatorname{ind}(-1) = \tfrac12 \phi(n)$ for $n > 2$ since $g^{2 \operatorname{ind}(-1)} \equiv 1 \pmod n$ and $2 \operatorname{ind}(-1) < 2\phi(n)$.

As an example of the use of indices, consider the congruence $x^n \equiv a \pmod p$, where $p$ is a prime. We have $n \operatorname{ind} x \equiv \operatorname{ind} a \pmod{p - 1}$ and thus if $(n, p - 1) = 1$ then there is just one solution. Consider, in particular, $x^5 \equiv 2 \pmod 7$. It is readily verified that 3 is a primitive root (mod 7) and we have $3^2 \equiv 2 \pmod 7$. Thus $5 \operatorname{ind} x \equiv 2 \pmod 6$, which gives $\operatorname{ind} x = 4$ and $x \equiv 3^4 \equiv 4 \pmod 7$.
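The use of indices in the example can be reproduced with a small table of powers of the primitive root. In the sketch below the names `index_table` and `solve_power_congruence` are hypothetical; the table is built by brute force, so the approach is only sensible for small primes, and Python 3.8 or later is assumed for the modular inverse.

```python
def index_table(g, p):
    """Map each residue g^l (mod p) to its index l, for l = 0, ..., p-2."""
    return {pow(g, l, p): l for l in range(p - 1)}


def solve_power_congruence(n, a, p, g):
    """Solve x^n = a (mod p) when (n, p-1) = 1 and (a, p) = 1,
    using indices with respect to the primitive root g."""
    ind = index_table(g, p)
    if len(ind) != p - 1:
        raise ValueError("g is not a primitive root (mod p)")
    # n * ind(x) = ind(a) (mod p-1) has a unique solution since (n, p-1) = 1.
    ind_x = (ind[a % p] * pow(n, -1, p - 1)) % (p - 1)
    return pow(g, ind_x, p)


if __name__ == "__main__":
    x = solve_power_congruence(5, 2, 7, 3)
    print(x, pow(x, 5, 7))
```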

Note that although there is no primitive root (mod $2^j$) for $j > 2$, the number 5 belongs to $2^{j-2}$ (mod $2^j$) and every odd integer $a$ is congruent (mod $2^j$) to just one integer of the form $(-1)^l 5^m$, where $l = 0, 1$ and $m = 0, 1, \ldots, 2^{j-2} - 1$. The pair $l$, $m$ has similar properties to the index defined above.

8 Further reading 

A good account of the elementary theory of congruences is given by T. Nagell, Introduction to number theory (Wiley, New York, 1951); this contains, in particular, a table of primitive roots. There is another, and in fact more extensive, table in I. M. Vinogradov’s An introduction to the theory of numbers (Pergamon Press, Oxford, London, New York, Paris, 1961). Again Hardy and Wright cover the subject well.

9 Exercises 

(i) Find an integer $x$ such that $2x \equiv 1 \pmod 3$, $3x \equiv 1 \pmod 5$, $5x \equiv 1 \pmod 7$.

(ii) Prove that for any positive integers $a$, $n$ with $(a, n) = 1$, $\sum \{ax/n\} = \tfrac12 \phi(n)$, where the summation is over all $x$ in a reduced set of residues (mod $n$).

(iii) The integers $a$ and $n > 1$ satisfy $a^{n-1} \equiv 1 \pmod n$ but $a^m \not\equiv 1 \pmod n$ for each divisor $m$ of $n - 1$, other than itself. Prove that $n$ is a prime.

(iv) Show that the congruence $x^{p-1} - 1 \equiv 0 \pmod{p^j}$ has just $p - 1$ solutions (mod $p^j$) for every prime power $p^j$.

(v) Prove that, for every natural number $n$, either there is no primitive root (mod $n$) or there are $\phi(\phi(n))$ primitive roots (mod $n$).

(vi) Prove that, for any prime $p$, the sum of all the distinct primitive roots (mod $p$) is congruent to $\mu(p - 1)$ (mod $p$).

(vii) Determine all the solutions of the congruence $y^2 \equiv 5x^3 \pmod 7$ in integers $x$, $y$.

(viii) Prove that if $p$ is a prime $> 3$ then the numerator of $1 + \tfrac12 + \cdots + 1/(p - 1)$ is divisible by $p^2$ (Wolstenholme’s theorem).
Quadratic residues


1 Legendre’s symbol 

In the last chapter we discussed the linear congruence $ax \equiv b \pmod n$. Here we shall study the quadratic congruence $x^2 \equiv a \pmod n$; in fact this amounts to the study of the general quadratic congruence $ax^2 + bx + c \equiv 0 \pmod n$, since on writing $d = b^2 - 4ac$ and $y = 2ax + b$, the latter gives $y^2 \equiv d \pmod{4an}$.

Let $a$ be any integer, let $n$ be a natural number and suppose that $(a, n) = 1$. Then $a$ is called a quadratic residue (mod $n$) if the congruence $x^2 \equiv a \pmod n$ is soluble; otherwise it is called a quadratic non-residue (mod $n$). The Legendre symbol $\left(\frac{a}{p}\right)$, where $p$ is a prime and $(a, p) = 1$, is defined as 1 if $a$ is a quadratic residue (mod $p$) and as $-1$ if $a$ is a quadratic non-residue (mod $p$). Clearly, if $a \equiv a' \pmod p$, we have

$$\left(\frac{a}{p}\right) = \left(\frac{a'}{p}\right).$$

2 Euler’s criterion 

This states that if $p$ is an odd prime then

$$\left(\frac{a}{p}\right) \equiv a^{\frac12 (p-1)} \pmod p.$$

For the proof we write, for brevity, $r = \tfrac12 (p - 1)$ and we note first that if $a$ is a quadratic residue (mod $p$) then for some $x$ in $\mathbb{N}$ we have $x^2 \equiv a \pmod p$, whence, by Fermat’s theorem, $a^r \equiv x^{p-1} \equiv 1 \pmod p$. Thus it suffices to show that if $a$ is a quadratic non-residue (mod $p$) then $a^r \equiv -1 \pmod p$. Now in any reduced set of residues (mod $p$) there are $r$ quadratic residues (mod $p$) and $r$ quadratic non-residues (mod $p$); for the numbers $1^2, 2^2, \ldots, r^2$ are mutually incongruent (mod $p$) and since, for any integer $k$, $(p - k)^2 \equiv k^2 \pmod p$, the numbers represent all the quadratic residues (mod $p$). Each of the numbers satisfies $x^r \equiv 1 \pmod p$, and, by Lagrange’s theorem, the congruence has at most $r$ solutions (mod $p$). Hence if $a$ is a quadratic non-residue (mod $p$) then $a$ is not a solution of the congruence. But, by Fermat’s theorem, $a^{p-1} \equiv 1 \pmod p$, whence $a^r \equiv \pm 1 \pmod p$. The required result follows. Note that one can argue alternatively in terms of a primitive root (mod $p$), say $g$; indeed it is clear that the quadratic residues (mod $p$) are given by $1, g^2, \ldots, g^{2r}$.

As an immediate corollary to Euler’s criterion we have the multiplicative property of the Legendre symbol, namely

$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$$

for all integers $a$, $b$ not divisible by $p$; here equality holds since both sides are $\pm 1$. Similarly we have

$$\left(\frac{-1}{p}\right) = (-1)^{\frac12 (p-1)};$$

in other words, $-1$ is a quadratic residue of all primes $\equiv 1 \pmod 4$ and a quadratic non-residue of all primes $\equiv 3 \pmod 4$. It will be recalled from § 4 of Chapter 3 that when $p \equiv 1 \pmod 4$ the solutions of $x^2 \equiv -1 \pmod p$ are given by $x = \pm(r!)$.
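Euler's criterion gives a direct way to evaluate the Legendre symbol by modular exponentiation. The sketch below uses the hypothetical name `legendre` and, as an added convention not taken from the text, returns 0 when $p$ divides $a$; it also lists the quadratic residues by squaring, so that the two can be compared.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion.
    Returns 0 if p divides a (a convention assumed for this sketch)."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


def quadratic_residues(p):
    """The quadratic residues (mod p) among 1, ..., p-1, found by squaring."""
    return sorted({x * x % p for x in range(1, p)})


if __name__ == "__main__":
    p = 11
    print(quadratic_residues(p))
    print([legendre(a, p) for a in range(1, p)])
```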

3 Gauss’ lemma 

For any integer $a$ and any natural number $n$ we define the numerically least residue of $a$ (mod $n$) as that integer $a'$ for which $a \equiv a' \pmod n$ and $-\tfrac12 n < a' \le \tfrac12 n$.

Let now $p$ be an odd prime and suppose that $(a, p) = 1$. Further let $a_j$ be the numerically least residue of $aj$ (mod $p$) for $j = 1, 2, \ldots$. Then Gauss’ lemma states that

$$\left(\frac{a}{p}\right) = (-1)^l,$$

where $l$ is the number of $j \le \tfrac12 (p - 1)$ for which $a_j < 0$.

For the proof we observe that the numbers $|a_j|$ with $1 \le j \le r$, where $r = \tfrac12 (p - 1)$, are simply the numbers $1, 2, \ldots, r$ in some order. For certainly we have $1 \le |a_j| \le r$, and the $|a_j|$ are distinct since $a_j = -a_k$, with $k \le r$, would give $a(j + k) \equiv 0 \pmod p$ with $0 < j + k < p$, which is impossible, and $a_j = a_k$ gives $aj \equiv ak \pmod p$, whence $j = k$. Hence we have $a_1 \cdots a_r = (-1)^l r!$. But $a_j \equiv aj \pmod p$ and so $a_1 \cdots a_r \equiv a^r r! \pmod p$. Thus $a^r \equiv (-1)^l \pmod p$, and the result now follows from Euler’s criterion.

As a corollary we obtain

$$\left(\frac{2}{p}\right) = (-1)^{\frac18 (p^2 - 1)},$$

that is, 2 is a quadratic residue of all primes $\equiv \pm 1 \pmod 8$ and a quadratic non-residue of all primes $\equiv \pm 3 \pmod 8$. To verify this result, note that, when $a = 2$, we have $a_j = 2j$ for $1 \le j \le [\tfrac14 p]$ and $a_j = 2j - p$ for $[\tfrac14 p] < j \le \tfrac12 (p - 1)$. Hence in this case $l = \tfrac12 (p - 1) - [\tfrac14 p]$, and it is readily checked that $l \equiv \tfrac18 (p^2 - 1) \pmod 2$.
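Gauss' lemma can be tested by counting negative least residues directly. The following sketch, with hypothetical function names, computes the numerically least residue, forms $(-1)^l$, and allows a comparison with Euler's criterion and with the corollary for $a = 2$.

```python
def least_residue(a, n):
    """Numerically least residue a' of a (mod n), with -n/2 < a' <= n/2."""
    r = a % n
    return r - n if 2 * r > n else r


def gauss_lemma_symbol(a, p):
    """(-1)^l, where l counts j <= (p-1)/2 with least residue of a*j negative."""
    l = sum(1 for j in range(1, (p - 1) // 2 + 1)
            if least_residue(a * j, p) < 0)
    return (-1) ** l


def euler_criterion_symbol(a, p):
    """Legendre symbol via a^((p-1)/2) (mod p), for (a, p) = 1."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23):
        print(p, gauss_lemma_symbol(2, p), (-1) ** ((p * p - 1) // 8))
```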

4 Law of quadratic reciprocity 

We come now to the famous theorem stated by Euler in 
1783 and first proved by Gauss in 1796. Apparently Euler, 
Legendre and Gauss each discovered the theorem independently 
and Gauss worked on it intensively for a year before establishing 
the result; he subsequently gave no fewer than eight demonstrations.

The law of quadratic reciprocity asserts that if $p$, $q$ are distinct odd primes then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac14 (p-1)(q-1)}.$$

Thus if $p$, $q$ are not both congruent to 3 (mod 4) then

$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right),$$

and in the exceptional case

$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right).$$

For the proof we observe that, by Gauss’ lemma, $\left(\frac{p}{q}\right) = (-1)^l$, where $l$ is the number of lattice points $(x, y)$ (that is, pairs of integers) satisfying $0 < x < \tfrac12 q$ and $-\tfrac12 q < px - qy < 0$. Now these inequalities give $y < (px/q) + \tfrac12 < \tfrac12 (p + 1)$. Hence, since $y$ is an integer, we see that $l$ is the number of lattice points in the rectangle $R$ defined by $0 < x < \tfrac12 q$, $0 < y < \tfrac12 p$, satisfying $-\tfrac12 q < px - qy < 0$ (see Fig. 4.1). Similarly $\left(\frac{q}{p}\right) = (-1)^m$, where $m$ is the number of lattice points in $R$ satisfying $-\tfrac12 p < qy - px < 0$. Now it suffices to prove that $\tfrac14 (p - 1)(q - 1) - (l + m)$ is even. But $\tfrac14 (p - 1)(q - 1)$ is just the number of lattice points in $R$, and thus the latter expression is the number of lattice points in $R$ satisfying either $px - qy \le -\tfrac12 q$ or $qy - px \le -\tfrac12 p$. The regions in $R$ defined by these inequalities are disjoint and they contain the same number of lattice points since, as is readily verified, the substitution

$$x = \tfrac12 (q + 1) - x', \qquad y = \tfrac12 (p + 1) - y'$$

furnishes a one-one correspondence between them. The theorem follows.

Fig. 4.1. The rectangle $R$ in the proof of the law of quadratic reciprocity.
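The lattice-point argument can be followed on small primes by counting the points explicitly. In the sketch below the names `lattice_counts` and `legendre` are hypothetical; the inequalities are doubled so that only integers appear, and the Legendre symbols used for comparison are computed by Euler's criterion.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, for (a, p) = 1."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def lattice_counts(p, q):
    """Return (l, m): the numbers of lattice points in the rectangle
    0 < x < q/2, 0 < y < p/2 with -q/2 < px - qy < 0 and
    with -p/2 < qy - px < 0 respectively."""
    l = m = 0
    for x in range(1, (q - 1) // 2 + 1):
        for y in range(1, (p - 1) // 2 + 1):
            t = p * x - q * y
            if -q < 2 * t < 0:
                l += 1
            if -p < -2 * t < 0:
                m += 1
    return l, m


if __name__ == "__main__":
    for p, q in ((3, 5), (3, 7), (5, 7), (7, 11), (11, 13)):
        l, m = lattice_counts(p, q)
        print(p, q, l, m, (p - 1) * (q - 1) // 4)
```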

The law of quadratic reciprocity is useful in the calculation 
of Legendre symbols. For example, we have 

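Further, for instance, we obtain

$$\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right),$$

whence $-3$ is a quadratic residue of all primes $\equiv 1 \pmod 6$ and a quadratic non-residue of all primes $\equiv -1 \pmod 6$.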

5 Jacobi’s symbol 

This is a generalization of the Legendre symbol. Let $n$ be a positive odd integer and suppose that $n = p_1 p_2 \cdots p_k$ as a product of primes, not necessarily distinct. Then, for any integer $a$ with $(a, n) = 1$, the Jacobi symbol is defined by

$$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right),$$

where the factors on the right are Legendre symbols. When $n = 1$ the Jacobi symbol is defined as 1 and when $(a, n) > 1$ it is defined as 0. Clearly, if $a \equiv a' \pmod n$ then

$$\left(\frac{a}{n}\right) = \left(\frac{a'}{n}\right).$$

It should be noted at once that

$$\left(\frac{a}{n}\right) = 1$$

does not imply that $a$ is a quadratic residue (mod $n$). Indeed $a$ is a quadratic residue (mod $n$) if and only if $a$ is a quadratic residue (mod $p$) for each prime divisor $p$ of $n$ (see § 5 of Chapter 3). But

$$\left(\frac{a}{n}\right) = -1$$

does imply that $a$ is a quadratic non-residue (mod $n$). Thus, for example, since

$$\left(\frac{6}{35}\right) = \left(\frac{6}{5}\right)\left(\frac{6}{7}\right) = -1,$$

we conclude that 6 is a quadratic non-residue (mod 35).

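The Jacobi symbol is multiplicative, like the Legendre symbol; that is

$$\left(\frac{a}{n}\right)\left(\frac{b}{n}\right) = \left(\frac{ab}{n}\right)$$

for all integers $a$, $b$ relatively prime to $n$. Further, if $m$, $n$ are odd and $(a, mn) = 1$ then

$$\left(\frac{a}{m}\right)\left(\frac{a}{n}\right) = \left(\frac{a}{mn}\right).$$

Furthermore we have

$$\left(\frac{-1}{n}\right) = (-1)^{\frac12 (n-1)}, \qquad \left(\frac{2}{n}\right) = (-1)^{\frac18 (n^2 - 1)},$$

and the analogue of the law of quadratic reciprocity holds, namely if $m$, $n$ are odd and $(m, n) = 1$ then

$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac14 (m-1)(n-1)}.$$

These results are readily verified from the corresponding theorems for the Legendre symbol, on noting that, if $n = n_1 n_2$, then

$$\tfrac12 (n - 1) \equiv \tfrac12 (n_1 - 1) + \tfrac12 (n_2 - 1) \pmod 2,$$

since $\tfrac12 (n_1 - 1)(n_2 - 1) \equiv 0 \pmod 2$, and that a similar congruence holds for $\tfrac18 (n^2 - 1)$.

Jacobi symbols can be used to facilitate the calculation of Legendre symbols. We have, for example,

$$\left(\frac{335}{2999}\right) = -\left(\frac{2999}{335}\right) = -\left(\frac{-16}{335}\right) = -\left(\frac{-1}{335}\right) = 1,$$

whence, since 2999 is a prime, it follows that 335 is a quadratic residue (mod 2999).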
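The properties listed in this section suffice to evaluate a Jacobi symbol without factorizing $n$. The sketch below is one common way of organizing such a computation and is not taken from the text: factors of 2 are removed with the rule for $\left(\frac{2}{n}\right)$, the symbol is then inverted with the reciprocity law, and the numerator is reduced modulo the denominator. The function name `jacobi` is hypothetical.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n; 0 if (a, n) > 1."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2, using (2/n) = (-1)^((n^2 - 1)/8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: the sign changes only if both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


if __name__ == "__main__":
    print(jacobi(335, 2999), jacobi(6, 35))
```

For the two examples in the text this gives 1 and $-1$ respectively.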

6 Further reading 

The theories here date back to the Disquisitiones arithmeticae of Gauss, and they are covered by numerous texts. An excellent account of the history relating to the law of quadratic reciprocity is given by Bachmann, Niedere Zahlentheorie (Teubner, Leipzig, 1902), Vol. 1. In particular he gives references to some forty different proofs. For an account of modern developments associated with the law of quadratic reciprocity see Artin and Tate, Class field theory (W.A. Benjamin Inc., New York, 1967) and Cassels and Fröhlich (Editors), Algebraic number theory (Academic Press, London, 1967).

The study of higher congruences, that is congruences of the form $f(x_1, \ldots, x_n) \equiv 0 \pmod{p^j}$, where $f$ is a polynomial with integer coefficients, leads to the concept of $p$-adic numbers and to deep theories in the realm of algebraic geometry; see, for example, Borevich and Shafarevich, Number theory (Academic Press, London, 1966), and Weil, ‘Numbers of solutions of equations in finite fields’, Bull. American Math. Soc. 55 (1949), 497-508.

7 Exercises 

(i) Determine the primes p for which 5 is a quadratic 
residue (mod p). 

(ii) Show that if $p$ is a prime $\equiv 3 \pmod 4$ and if $p' = 2p + 1$
is a prime then $2^p \equiv 1 \pmod{p'}$. Deduce that $2^{251} - 1$ is
not a Mersenne prime.

(iii) Show that if $p$ is an odd prime then the product $P$ of
all the quadratic residues (mod $p$) satisfies
$P \equiv (-1)^{(p+1)/2} \pmod p$.

(iv) Prove that if $p$ is a prime $\equiv 1 \pmod 4$ then
$\sum r = \tfrac14 p(p - 1)$, where the summation is over all quadratic
residues $r$ with $1 \le r \le p - 1$.

(v) Evaluate the Jacobi symbol 

(vi) Show that, for any integer $d$ and any odd prime $p$, the
number of solutions of the congruence $x^2 \equiv d \pmod p$
is $1 + \left(\frac{d}{p}\right)$.

(vii) Let $f(x) = ax^2 + bx + c$, where $a$, $b$, $c$ are integers, and
let $p$ be an odd prime that does not divide $a$. Further
let $d = b^2 - 4ac$. Show that, if $p$ does not divide $d$,
then
$$\sum_{x=1}^{p} \left(\frac{f(x)}{p}\right) = -\left(\frac{a}{p}\right).$$
Evaluate the sum when $p$ divides $d$.

(viii) Prove that if $p'$ is a prime $\equiv 1 \pmod 4$ and if
$p = 2p' + 1$ is a prime then 2 is a primitive root (mod $p$).
For which primes $p'$ with $p = 2p' + 1$ prime is 5 a
primitive root (mod $p$)?

(ix) Show that if $p$ is a prime and $a$, $b$, $c$ are integers not
divisible by $p$ then there are integers $x$, $y$ such that
$ax^2 + by^2 \equiv c \pmod p$.

(x) Let $f = f(x_1, \ldots, x_n)$ be a polynomial with integer
coefficients that vanishes at the origin and let $p$ be a
prime. Prove that if the congruence $f \equiv 0 \pmod p$ has
only the trivial solution then the polynomial
$$1 - f^{p-1} - (1 - x_1^{p-1}) \cdots (1 - x_n^{p-1})$$
is divisible by $p$ for all integers $x_1, \ldots, x_n$. Deduce
that if $f$ has total degree less than $n$ then the
congruence $f \equiv 0 \pmod p$ has a non-trivial solution
(Chevalley’s theorem).

(xi) Prove that if $f = f(x_1, \ldots, x_n)$ is a quadratic form with
integer coefficients, if $n \ge 3$, and if $p$ is a prime then
the congruence $f \equiv 0 \pmod p$ has a non-trivial
solution.


Quadratic forms


1 Equivalence 

We shall consider binary quadratic forms
$$f(x, y) = ax^2 + bxy + cy^2,$$
where $a$, $b$, $c$ are integers. By the discriminant of $f$ we mean the
number $d = b^2 - 4ac$. Plainly $d \equiv 0 \pmod 4$ if $b$ is even and
$d \equiv 1 \pmod 4$ if $b$ is odd. The forms $x^2 - \tfrac14 dy^2$ for
$d \equiv 0 \pmod 4$ and $x^2 + xy + \tfrac14(1 - d)y^2$ for $d \equiv 1 \pmod 4$
are called the principal forms with discriminant $d$. We have
$$4af(x, y) = (2ax + by)^2 - dy^2,$$
whence if $d < 0$ the values taken by $f$ are all of the same sign
(or zero); $f$ is called positive or negative definite accordingly. If
$d > 0$ then $f$ takes values of both signs and it is called indefinite.

We say that two quadratic forms are equivalent if one can be
transformed into the other by an integral unimodular substitution,
that is, a substitution of the form
$$x = px' + qy', \qquad y = rx' + sy',$$
where $p$, $q$, $r$, $s$ are integers with $ps - qr = 1$. It is readily verified
that this relation is reflexive, symmetric and transitive. Further,
it is clear that the sets of values assumed by equivalent forms as
$x$, $y$ run through the integers are the same, and indeed they
assume the same set of values as the pair $x$, $y$ runs through all
relatively prime integers; for $(x, y) = 1$ if and only if $(x', y') = 1$.
Furthermore equivalent forms have the same discriminant. For
the substitution takes $f$ into
$$f'(x', y') = a'x'^2 + b'x'y' + c'y'^2,$$
where
$$a' = f(p, r), \qquad b' = 2apq + b(ps + qr) + 2crs, \qquad c' = f(q, s),$$
and it is readily checked that $b'^2 - 4a'c' = d(ps - qr)^2$. Alternatively,
in matrix notation, we can write $f$ as $X^T F X$ and the
substitution as $X = UX'$, where
$$X = \begin{pmatrix} x \\ y \end{pmatrix}, \quad X' = \begin{pmatrix} x' \\ y' \end{pmatrix}, \quad F = \begin{pmatrix} a & \tfrac12 b \\ \tfrac12 b & c \end{pmatrix}, \quad U = \begin{pmatrix} p & q \\ r & s \end{pmatrix};$$
then $f$ is transformed into $X'^T F' X'$, where $F' = U^T F U$, and, since
the determinant of $U$ is 1, it follows that the determinants of $F$
and $F'$ are equal.
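
The formulae for the transformed coefficients lend themselves to a direct numerical check. The following is a minimal sketch, assuming a form is stored as a triple (a, b, c) and a substitution as a quadruple (p, q, r, s); the names `transform` and `discriminant` are illustrative only.

```python
def transform(form, sub):
    """Apply x = p x' + q y', y = r x' + s y' to the form (a, b, c)."""
    a, b, c = form
    p, q, r, s = sub

    def f(x, y):
        return a * x * x + b * x * y + c * y * y

    # a' = f(p, r), b' = 2apq + b(ps + qr) + 2crs, c' = f(q, s)
    return (f(p, r), 2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s, f(q, s))


def discriminant(form):
    """Discriminant b^2 - 4ac of the form (a, b, c)."""
    a, b, c = form
    return b * b - 4 * a * c
```

For instance the form (2, 1, 4), with discriminant -31, is taken by the unimodular substitution (2, 1, 1, 1) into (14, 19, 7), which again has discriminant -31.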


2 Reduction 

There is an elegant theory of reduction relating to positive
definite quadratic forms which we shall now describe.
Accordingly we shall assume henceforth that $d < 0$ and that
$a > 0$; then we have also $c > 0$.

We begin by observing that by a finite sequence of unimodular
substitutions of the form $x = y'$, $y = -x'$ and $x = x' \pm y'$, $y = y'$, $f$
can be transformed into another binary form for which
$|b| \le a \le c$. For the first of these substitutions interchanges $a$ and
$c$ whence it allows one to replace $a > c$ by $a < c$; and the second has
the effect of changing $b$ to $b \pm 2a$, leaving $a$ unchanged, whence,
by finitely many applications it allows one to replace $|b| > a$ by
$|b| \le a$. The process must terminate since, whenever the first
substitution is applied it results in a smaller value of $a$. In fact
we can transform $f$ into a binary form for which either
$$-a < b \le a < c \quad \text{or} \quad 0 \le b \le a = c.$$
For if $b = -a$ then the second of the above substitutions allows
one to take $b = a$, leaving $c$ unchanged, and if $a = c$ then the
first substitution allows one to take $0 \le b$. A binary form for
which one or other of the above conditions on $a$, $b$, $c$ holds is
said to be reduced.
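
The reduction process can be sketched as a short loop. The version below is an illustration only: it assumes the form is given by its coefficients (a, b, c) with a > 0 and negative discriminant, and the name `reduce_form` is ours. The two branches correspond to the two substitutions described above; the second substitution sends (a, b, c) to (a, b ± 2a, a ± b + c).

```python
def reduce_form(a, b, c):
    """Return the reduced form equivalent to the positive definite form (a, b, c)."""
    if a <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("expected a > 0 and negative discriminant")
    while True:
        if a > c or (a == c and b < 0):
            # x = y', y = -x' interchanges a and c and changes the sign of b.
            a, b, c = c, -b, a
        elif b > a:
            # x = x' - y', y = y' replaces b by b - 2a.
            a, b, c = a, b - 2 * a, a - b + c
        elif b <= -a:
            # x = x' + y', y = y' replaces b by b + 2a.
            a, b, c = a, b + 2 * a, a + b + c
        else:
            return a, b, c
```

Thus `reduce_form(14, 19, 7)` passes through (7, -19, 14), (7, -5, 2), (2, 5, 7) and ends at (2, 1, 4).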

There are only finitely many reduced forms with a given
discriminant $d$; for if $f$ is reduced then $-d = 4ac - b^2 \ge 3ac$,
whence $a$, $c$ and $|b|$ cannot exceed $\tfrac13|d|$. The number of reduced
forms with discriminant $d$ is called the class number and it is
denoted by $h(d)$. To calculate the class number when $d = -4$,
for example, we note that the inequality $3ac \le 4$ gives $a = c = 1$,
whence $b = 0$ and $h(-4) = 1$. The number $h(d)$ is actually the
number of inequivalent classes of binary quadratic forms with
discriminant $d$ since, as we shall now prove, no two distinct reduced
forms are equivalent.
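
Since a reduced form satisfies $3a^2 \le 3ac \le -d$, the reduced forms with a given discriminant can be listed by a finite search. The sketch below does exactly this; the function name `reduced_forms` is an assumption of ours, and $h(d)$ is then simply the length of the list.

```python
def reduced_forms(d):
    """List the reduced positive definite forms (a, b, c) with discriminant d < 0."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be negative and congruent to 0 or 1 (mod 4)")
    forms = []
    a = 1
    while 3 * a * a <= -d:              # a reduced form has 3a^2 <= 3ac <= -d
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                # Either a < c, or a = c with b >= 0.
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms
```

For $d = -4$ the search returns the single form (1, 0, 1), in agreement with $h(-4) = 1$.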

Let $f(x, y)$ be a reduced form. Then if $x$, $y$ are non-zero integers
and $|x| \ge |y|$ we have
$$f(x, y) \ge |x|(a|x| - |by|) + c|y|^2 \ge |x|^2(a - |b|) + c|y|^2 \ge a - |b| + c.$$
Similarly if $|y| \ge |x|$ we have $f(x, y) \ge a - |b| + c$. Hence the
smallest values assumed by $f$ for relatively prime integers $x$, $y$ are $a$,
$c$ and $a - |b| + c$ in that order; these values are taken at $(1, 0)$,
$(0, 1)$ and either $(1, 1)$ or $(1, -1)$. Now the sequences of values
assumed by equivalent forms for relatively prime $x$, $y$ are the
same, except for a rearrangement, and thus if $f'$ is a form, as in
§ 1, equivalent to $f$ and if also $f'$ is reduced, then $a = a'$, $c = c'$
and $b = \pm b'$. It remains therefore to prove that if $b = -b'$ then
in fact $b = 0$. We can assume here that $-a < b < a < c$, for, since
$f'$ is reduced, we have $-a < -b$, and if $a = c$ then we have $b \ge 0$,
$-b \ge 0$, whence $b = 0$. It follows that $f(x, y) \ge a - |b| + c > c > a$
for all non-zero integers $x$, $y$. But, with the notation of § 1 for
the substitution taking $f$ to $f'$, we have $a = f(p, r)$. Thus $p = \pm 1$,
$r = 0$, and from $ps - qr = 1$ we obtain $s = \pm 1$. Further we have
$c = f(q, s)$ whence $q = 0$. Hence the only substitutions taking $f$
to $f'$ are $x = x'$, $y = y'$ and $x = -x'$, $y = -y'$. These give $b = 0$, as
required.

3 Representations by binary forms 

A number $n$ is said to be properly represented by a binary
form $f$ if $n = f(x, y)$ for some integers $x$, $y$ with $(x, y) = 1$. There
is a useful criterion in connection with such representations,
namely $n$ is properly represented by some binary form with
discriminant $d$ if and only if the congruence $x^2 \equiv d \pmod{4n}$ is
soluble.

For the proof, suppose first that the congruence is soluble and
let $x = b$ be a solution. Define $c$ by $b^2 - 4nc = d$ and put $a = n$.
Then the form $f$, as in § 1, has discriminant $d$ and it properly
represents $n$, in fact $f(1, 0) = n$. Conversely suppose that $f$ has
discriminant $d$ and that $n = f(p, r)$ for some integers $p$, $r$ with
$(p, r) = 1$. Then there exist integers $q$, $s$ with $ps - qr = 1$ and $f$ is
equivalent to a form $f'$ as in § 1 with $a' = n$. But $f$ and $f'$ have
the same discriminants and so $b'^2 - 4nc' = d$. Hence the congruence
$x^2 \equiv d \pmod{4n}$ has a solution $x = b'$.
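
The first half of the proof is constructive, and it may be sketched as follows. The function name `form_representing` is hypothetical; the search over $0 \le b < 2n$ suffices because $b$ and $b + 2n$ have the same square (mod $4n$).

```python
def form_representing(n, d):
    """Return a form (n, b, c) of discriminant d, so that f(1, 0) = n, or None."""
    for b in range(2 * n):
        if (b * b - d) % (4 * n) == 0:
            # c is defined by b^2 - 4nc = d.
            return (n, b, (b * b - d) // (4 * n))
    return None
```

With $d = -4$ this gives the form (5, 4, 1) for $n = 5$, whereas for $n = 3$ the congruence $x^2 \equiv -4 \pmod{12}$ is insoluble and `None` is returned.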

The ideas here can be developed to furnish, in the case
$(n, d) = 1$, the number of proper representations of $n$ by all reduced
forms with a given discriminant $d$. Indeed the quantity in question
is given by $ws$, where $s$ is the number of solutions of the
congruence $x^2 \equiv d \pmod{4n}$ with $0 \le x < 2n$ and $w$ is the number
of automorphs of a reduced form; by an automorph of $f$ we
mean an integral unimodular substitution that takes $f$ into itself.
The number $w$ is related to the solutions of the Pell equation
(see § 3 of Chapter 7); it is given by 2 for $d < -4$, by 4 for $d = -4$
and by 6 for $d = -3$. In fact the only automorphs, for $d < -4$,
are $x = x'$, $y = y'$ and $x = -x'$, $y = -y'$.

4 Sums of two squares 

Let $n$ be a natural number. We proceed to prove that $n$
can be expressed in the form $x^2 + y^2$ for some integers $x$, $y$ if and
only if every prime divisor $p$ of $n$ with $p \equiv 3 \pmod 4$ occurs to
an even power in the standard factorization of $n$. The result dates
back to Fermat and Euler.

The necessity is easily verified, for suppose that $n = x^2 + y^2$
and that $n$ is divisible by a prime $p \equiv 3 \pmod 4$. Then
$x^2 \equiv -y^2 \pmod p$ and since $-1$ is a quadratic non-residue (mod $p$),
we see that $p$ divides $x$ and $y$. Thus we have $(x/p)^2 + (y/p)^2 = n/p^2$,
and it follows by induction that $p$ divides $n$ to an even
power.

To prove the converse it will suffice to assume that $n$ is square
free and to show that if each odd prime divisor $p$ of $n$ satisfies
$p \equiv 1 \pmod 4$ then $n$ can be represented by $x^2 + y^2$; for clearly
if $n = x^2 + y^2$ then $nm^2 = (xm)^2 + (ym)^2$. Now the quadratic form
$x^2 + y^2$ is a reduced form with discriminant $-4$, and it was proved
in § 2 that $h(-4) = 1$. Hence it is the only such reduced form. It
follows from § 3 that $n$ is properly represented by $x^2 + y^2$ if and
only if the congruence $x^2 \equiv -4 \pmod{4n}$ is soluble. But, by
hypothesis, $-1$ is a quadratic residue (mod $p$) for each prime
divisor $p$ of $n$. Hence $-1$ is a quadratic residue (mod $n$) and the
result follows.

It will be noted that the argument involves the Chinese
remainder theorem; but this can be avoided by appeal to the
identity
$$(x^2 + y^2)(x'^2 + y'^2) = (xx' + yy')^2 + (xy' - yx')^2$$
which enables one to consider only prime values of $n$. In fact
there is a well known proof of the theorem based on this identity
alone, similar to § 5 below.

The demonstration here can be refined to furnish the number
of representations of $n$ as $x^2 + y^2$. The number is given by
$4 \sum \left(\frac{-1}{m}\right)$, where the summation is over all odd divisors $m$ of $n$.
Thus, for instance, each prime $p \equiv 1 \pmod 4$ can be expressed
in precisely eight ways as the sum of two squares.
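
The count of representations is easily compared with a direct enumeration. In the sketch below, which is ours and not part of the text, representations are ordered pairs of integers with signs taken into account, and the symbol $\left(\frac{-1}{m}\right)$ for odd $m$ is evaluated as $(-1)^{(m-1)/2}$.

```python
from math import isqrt


def r2_bruteforce(n):
    """Count the ordered pairs of integers (x, y) with x^2 + y^2 = n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        y_squared = n - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y = 0
    return count


def r2_formula(n):
    """Four times the sum of (-1)^((m-1)/2) over the odd divisors m of n."""
    return 4 * sum((-1) ** ((m - 1) // 2) for m in range(1, n + 1, 2) if n % m == 0)
```

For the prime 5 both functions give 8, the representations being $(\pm 1, \pm 2)$ and $(\pm 2, \pm 1)$.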

5 Sums of four squares 

We prove now the famous theorem stated by Bachet in
1621 and first demonstrated by Lagrange in 1770 to the effect
that every natural number can be expressed as the sum of four
integer squares. Our proof will be based on the identity
$$(x^2 + y^2 + z^2 + w^2)(x'^2 + y'^2 + z'^2 + w'^2)$$
$$= (xx' + yy' + zz' + ww')^2 + (xy' - yx' + wz' - zw')^2 + (xz' - zx' + yw' - wy')^2 + (xw' - wx' + zy' - yz')^2,$$
which is related to the theory of quaternions.

In view of the identity and the trivial representation
$2 = 1^2 + 1^2 + 0^2 + 0^2$, it will suffice to prove the theorem for odd primes
$p$. Now the numbers $x^2$ with $0 \le x \le \tfrac12(p - 1)$ are mutually
incongruent (mod $p$), and the same holds for the numbers $-1 - y^2$ with
$0 \le y \le \tfrac12(p - 1)$. Thus we have $x^2 \equiv -1 - y^2 \pmod p$ for some $x$, $y$
satisfying $x^2 + y^2 + 1 < 1 + 2(\tfrac12 p)^2 < p^2$. Hence we obtain
$mp = x^2 + y^2 + 1$ for some integer $m$ with $0 < m < p$.

Let $l$ be the least positive integer such that $lp = x^2 + y^2 + z^2 + w^2$
for some integers $x$, $y$, $z$, $w$. Then $l \le m < p$. Further $l$ is odd, for
if $l$ were even then an even number of $x$, $y$, $z$, $w$ would be odd
and we could assume that $x + y$, $x - y$, $z + w$, $z - w$ are even; but
$$\tfrac12 lp = (\tfrac12(x + y))^2 + (\tfrac12(x - y))^2 + (\tfrac12(z + w))^2 + (\tfrac12(z - w))^2$$
and this is inconsistent with the minimal choice of $l$. To prove
the theorem we have to show that $l = 1$; accordingly we suppose
that $l > 1$ and obtain a contradiction. Let $x'$, $y'$, $z'$, $w'$ be the
numerically least residues of $x$, $y$, $z$, $w$ (mod $l$) and put
$$n = x'^2 + y'^2 + z'^2 + w'^2.$$
Then $n \equiv 0 \pmod l$ and we have $n > 0$, for otherwise $l$ would
divide $p$. Further, since $l$ is odd, we have $n < 4(\tfrac12 l)^2 = l^2$. Thus
$n = kl$ for some integer $k$ with $0 < k < l$. Now by the identity we
see that $(kl)(lp)$ is expressible as a sum of four integer squares,
and moreover it is clear that each of these squares is divisible
by $l^2$. Thus $kp$ is expressible as a sum of four integer squares.
But this contradicts the definition of $l$ and the theorem follows.

The argument here is an illustration of Fermat’s method of 
infinite descent. 
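
The descent can be turned into a procedure that actually produces four squares for a given odd prime. The following is a sketch under our own conventions: the name `four_squares` is hypothetical, the initial pair $x$, $y$ is found by plain search, and an even multiplier is halved directly rather than excluded by minimality as in the proof.

```python
def four_squares(p):
    """Express an odd prime p as a sum of four squares by descent."""
    half = (p - 1) // 2
    # Find x, y with x^2 + y^2 + 1 = m p and 0 < m < p.
    x, y = next((x, y) for x in range(half + 1) for y in range(half + 1)
                if (x * x + y * y + 1) % p == 0)
    v = [x, y, 1, 0]
    l = (x * x + y * y + 1) // p
    while l > 1:
        if l % 2 == 0:
            # Pair off entries of equal parity and halve, as in the text.
            x, y, z, w = sorted(v, key=lambda t: t % 2)
            v = [(x + y) // 2, (x - y) // 2, (z + w) // 2, (z - w) // 2]
            l //= 2
        else:
            x, y, z, w = v
            # Numerically least residues (mod l).
            xp, yp, zp, wp = [((t + l // 2) % l) - l // 2 for t in v]
            k = (xp * xp + yp * yp + zp * zp + wp * wp) // l
            # Each term of the identity is divisible by l.
            v = [(x * xp + y * yp + z * zp + w * wp) // l,
                 (x * yp - y * xp + w * zp - z * wp) // l,
                 (x * zp - z * xp + y * wp - w * yp) // l,
                 (x * wp - w * xp + z * yp - y * zp) // l]
            l = k
    return tuple(abs(t) for t in v)
```

For $p = 23$ the search gives $2^2 + 8^2 + 1^2 + 0^2 = 3 \cdot 23$, and one step of the descent then yields $23 = 3^2 + 2^2 + 1^2 + 3^2$.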

There is a result dating back to Legendre and Gauss to the
effect that a natural number is the sum of three squares if and
only if it is not of the form $4^j(8k + 7)$ with $j$, $k$ non-negative
integers. Here the necessity is obvious since a square is congruent
to 0, 1 or 4 (mod 8) but the sufficiency depends on the theory of
ternary quadratic forms.

Waring conjectured in 1770 that every natural number can be
represented as the sum of four squares, nine cubes, nineteen
biquadrates ‘and so on’. One interprets the latter to mean that,
for every integer $k \ge 2$ there exists an integer $s = s(k)$ such that
every natural number $n$ can be expressed in the form
$x_1^k + \cdots + x_s^k$ with $x_1, \ldots, x_s$ non-negative integers; and it is
customary to denote the least such $s$ by $g(k)$. Thus we have $g(2) = 4$.
Waring’s conjecture was proved by Hilbert in 1909. Another, quite
different proof was given by Hardy and Littlewood in 1920 and it
was here that they described for the first time their famous ‘circle
method’. The work depends on the identity
$$\sum_{n=0}^{\infty} r(n) z^n = (f(z))^s,$$
where $r(n)$ denotes the number of representations of $n$ in the
required form and $f(z) = 1 + z^{1^k} + z^{2^k} + \cdots$. Thus we have
$$r(n) = \frac{1}{2\pi i} \int_C \frac{(f(z))^s}{z^{n+1}} \, dz$$
for a suitable contour $C$. The argument now involves a delicate
division of the contour into ‘major’ and ‘minor’ arcs, and the
analysis leads to an asymptotic expression for $r(n)$ and to precise
estimates for $g(k)$.


6 Further reading 

A careful account of the theory of binary quadratic forms 
is given in Landau, Elementary number theory (Chelsea Publ. 
Co., New York, 1966); see also Davenport, The higher arithmetic 
(Cambridge U.P., 5th edn, 1982). As there, we have used the 
classical definition of equivalence in terms of substitutions with 
determinant 1; however, there is an analogous theory involving 
substitutions with determinant ±1 and this is described in Niven 
and Zuckerman, An introduction to the theory of numbers 
(Wiley, New York, 4th edn, 1980). 

For a comprehensive account of the general theory of quad- 
ratic forms see Cassels, Rational quadratic forms (Academic 
Press, London and New York, 1978). For an account of the 
analysis appertaining to Waring’s problem see R. C. Vaughan, 
The Hardy-Littlewood method (Cambridge U.P., 1981). 

7 Exercises 

(i) Prove that $h(d) = 1$ when $d = -3, -4, -7, -8, -11, -19, -43, -67$
and $-163$.

(ii) Determine all the odd primes that can be expressed in
the form $x^2 + xy + 5y^2$.

(iii) Determine all the positive integers that can be
expressed in the form $x^2 + 2y^2$.

(iv) Determine all the positive integers that can be
expressed in the form $x^2 - y^2$.

(v) Show that there are precisely two reduced forms with
discriminant $-20$. Hence prove that the primes that
can be represented by $x^2 + 5y^2$ are 5 and those
congruent to 1 or 9 (mod 20).

(vi) Calculate $h(-31)$.

(vii) Find the least positive integer that can be represented
by $4x^2 + 17xy + 20y^2$.

(viii) Prove that $n$ and $2n$, where $n$ is any positive integer,
have the same number of representations as the sum of
two squares.

(ix) Find the least integer $s$ such that $n = x_1^k + \cdots + x_s^k$,
where $n = 2^k[(3/2)^k] - 1$ and $x_1, \ldots, x_s$ are positive
integers.

Diophantine approximation


1 Dirichlet’s theorem 

Diophantine approximation is concerned with the solubility
of inequalities in integers. The simplest result in this field
was obtained by Dirichlet in 1842. He showed that, for any real
$\theta$ and any integer $Q > 1$ there exist integers $p$, $q$ with $0 < q < Q$
such that $|q\theta - p| \le 1/Q$.

The result can be derived at once from the so-called ‘box’ or
‘pigeon-hole’ principle. This asserts that if there are $n$ holes
containing $n + 1$ pigeons then there must be at least two pigeons
in some hole. Consider in fact the $Q + 1$ numbers $0$, $1$, $\{\theta\}$,
$\{2\theta\}, \ldots, \{(Q - 1)\theta\}$, where $\{x\}$ denotes the fractional part of $x$
as in Chapter 2. These numbers all lie in the interval $[0, 1]$, and
if one divides the latter, as clearly one can, into $Q$ disjoint
sub-intervals, each of length $1/Q$, then it follows that two of the
$Q + 1$ numbers must lie in one of the $Q$ sub-intervals. The
difference between the two numbers has the form $q\theta - p$, where
$p$, $q$ are integers with $0 < q < Q$, and we have $|q\theta - p| \le 1/Q$, as
required.
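
The pigeon-hole argument is effective, and it can be sketched in a few lines. The version below assumes, purely so that the arithmetic is exact, that $\theta$ is supplied as a `Fraction`; the name `dirichlet` is ours.

```python
from fractions import Fraction
from math import floor


def dirichlet(theta, Q):
    """Return integers (p, q) with 0 < q < Q and |q*theta - p| <= 1/Q."""
    # Each point is (value, q, p) with value = q*theta - p lying in [0, 1].
    points = [(Fraction(0), 0, 0), (Fraction(1), 0, -1)]
    for j in range(1, Q):
        p = floor(j * theta)
        points.append((j * theta - p, j, p))
    boxes = {}
    for value, q, p in points:
        k = min(floor(value * Q), Q - 1)   # the number 1 is put in the last box
        if k in boxes:
            q0, p0 = boxes[k]
            # Points are visited in order of increasing q, so q > q0 here.
            return p - p0, q - q0
        boxes[k] = (q, p)
    raise AssertionError("unreachable: Q + 1 points lie in Q boxes")
```

The two numbers 0 and 1 fall in different boxes when $Q > 1$, so the pair returned always has $q > 0$.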

Dirichlet’s theorem holds more generally for any real $Q > 1$;
the result for non-integral $Q$ follows from the theorem just
established with $Q$ replaced by $[Q] + 1$. Further it is clear that
the integers $p$, $q$ referred to in the theorem can be chosen to be
relatively prime. When $\theta$ is irrational we have the important
corollary that there exist infinitely many rationals $p/q$ ($q > 0$)
such that $|\theta - p/q| < 1/q^2$. Indeed, for $Q > 1$, there is a rational
$p/q$ with $|\theta - p/q| \le 1/(Qq) < 1/q^2$; moreover, if $\theta$ is irrational
then, for any $Q'$ exceeding $1/|q\theta - p|$, the rational corresponding
to $Q'$ will be different from $p/q$. Note that the corollary does
not remain valid for rational $\theta$; for if $\theta = a/b$ with $a$, $b$ integers
and $b > 0$ then, when $\theta \ne p/q$, we have $|\theta - p/q| \ge 1/(qb)$ and so
there are only finitely many rationals $p/q$ such that
$|\theta - p/q| < 1/q^2$.

2 Continued fractions 

The continued fraction algorithm sets up a one-one
correspondence between all irrational $\theta$ and all infinite sets of
integers $a_0, a_1, a_2, \ldots$ with $a_1, a_2, \ldots$ positive. It also sets up a
one-one correspondence between all rational $\theta$ and all finite sets
of integers $a_0, a_1, \ldots, a_n$ with $a_1, a_2, \ldots, a_{n-1}$ positive and with
$a_n \ge 2$.

To describe the algorithm, let $\theta$ be any real number. We put
$a_0 = [\theta]$. If $a_0 \ne \theta$ we write $\theta = a_0 + 1/\theta_1$, so that $\theta_1 > 1$, and we
put $a_1 = [\theta_1]$. If $a_1 \ne \theta_1$ we write $\theta_1 = a_1 + 1/\theta_2$, so that $\theta_2 > 1$, and
we put $a_2 = [\theta_2]$. The process continues indefinitely unless $a_n = \theta_n$
for some $n$. It is clear that if the latter occurs then $\theta$ is rational;
in fact we have
$$\theta = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}}.$$
Conversely, as will be clear in a moment, if $\theta$ is rational then
the process terminates. The expression above is called the continued
fraction for $\theta$; it is customary to write the equation briefly
as
$$\theta = a_0 + \frac{1}{a_1 +} \, \frac{1}{a_2 +} \cdots \frac{1}{a_n}$$
or, more briefly, as
$$\theta = [a_0, a_1, a_2, \ldots, a_n].$$

If $a_n \ne \theta_n$ for all $n$, so that the process does not terminate, then
$\theta$ is irrational. We proceed to show that one can then write
$$\theta = a_0 + \frac{1}{a_1 +} \, \frac{1}{a_2 +} \cdots,$$
or briefly
$$\theta = [a_0, a_1, a_2, \ldots].$$
The integers $a_0, a_1, a_2, \ldots$ are known as the partial quotients of
$\theta$; the numbers $\theta_1, \theta_2, \ldots$ are referred to as the complete quotients
of $\theta$. We shall prove that the rationals
$$p_n/q_n = [a_0, a_1, \ldots, a_n],$$
where $p_n$, $q_n$ denote relatively prime integers, tend to $\theta$ as $n \to \infty$;
they are in fact known as the convergents to $\theta$.

First we show that the $p_n$, $q_n$ are generated recursively by the
equations
$$p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2},$$
where $p_0 = a_0$, $q_0 = 1$ and $p_1 = a_0 a_1 + 1$, $q_1 = a_1$. The recurrences
plainly hold for $n = 2$; we assume that they hold for $n = m - 1 \ge 2$
and we proceed to verify them for $n = m$. We define relatively
prime integers $p'_j$, $q'_j$ ($j = 0, 1, \ldots$) by
$$p'_j/q'_j = [a_1, a_2, \ldots, a_{j+1}],$$
and we apply the recurrences to $p'_{m-1}$, $q'_{m-1}$; they give
$$p'_{m-1} = a_m p'_{m-2} + p'_{m-3}, \qquad q'_{m-1} = a_m q'_{m-2} + q'_{m-3}.$$
But we have $p_j/q_j = a_0 + q'_{j-1}/p'_{j-1}$, whence
$$p_j = a_0 p'_{j-1} + q'_{j-1}, \qquad q_j = p'_{j-1}.$$
Thus, on taking $j = m$, we obtain
$$p_m = a_m(a_0 p'_{m-2} + q'_{m-2}) + a_0 p'_{m-3} + q'_{m-3},$$
$$q_m = a_m p'_{m-2} + p'_{m-3},$$
and, on taking $j = m - 1$ and $j = m - 2$, it follows that
$$p_m = a_m p_{m-1} + p_{m-2}, \qquad q_m = a_m q_{m-1} + q_{m-2},$$
as required.

Now by the definition of $\theta_1, \theta_2, \ldots$ we have
$$\theta = [a_0, a_1, \ldots, a_n, \theta_{n+1}],$$
where $0 < 1/\theta_{n+1} \le 1/a_{n+1}$; hence $\theta$ lies between $p_n/q_n$ and
$p_{n+1}/q_{n+1}$. It is readily seen by induction that the above recurrences
give
$$p_n q_{n+1} - p_{n+1} q_n = (-1)^{n+1},$$
and thus we have
$$|p_n/q_n - p_{n+1}/q_{n+1}| = 1/(q_n q_{n+1}).$$
It follows that the convergents $p_n/q_n$ to $\theta$ satisfy
$$|\theta - p_n/q_n| \le 1/(q_n q_{n+1}),$$
and so certainly $p_n/q_n \to \theta$ as $n \to \infty$.

In view of the latter inequality and the remarks at the end of
§ 1, it is now clear that when $\theta$ is rational the continued fraction
process terminates. Indeed, for rational $\theta$, the process is closely
related to Euclid’s algorithm as described in Chapter 1. In fact,
if, with the notation of § 4 of Chapter 1, we take $\theta = a/b$ and
$a_j = q_{j+1}$ ($0 \le j \le k$) then we have
$$\theta = [a_0, a_1, \ldots, a_k];$$
thus, for example, $187/35 = [5, 2, 1, 11]$.
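
For rational $\theta$ both the algorithm and the recurrences for the convergents can be carried out exactly. The following sketch assumes $\theta$ is given as a `Fraction`; the function names are ours, and the starting values $p_{-1} = 1$, $q_{-1} = 0$ are merely a convenient convention chosen so that the recurrences also produce $p_1$, $q_1$.

```python
from fractions import Fraction
from math import floor


def continued_fraction(theta):
    """Partial quotients [a_0, a_1, ..., a_n] of a rational number theta."""
    theta = Fraction(theta)
    quotients = []
    while True:
        a = floor(theta)
        quotients.append(a)
        if theta == a:               # a_n = theta_n: the process terminates
            return quotients
        theta = 1 / (theta - a)      # next complete quotient


def convergents(quotients):
    """List the pairs (p_n, q_n) given by p_n = a_n p_{n-1} + p_{n-2}, and likewise q_n."""
    p_prev, q_prev = 1, 0            # convention: p_{-1} = 1, q_{-1} = 0
    p, q = quotients[0], 1
    pairs = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        pairs.append((p, q))
    return pairs
```

Applied to 187/35 the first function gives the partial quotients 5, 2, 1, 11, and the second gives the convergents 5/1, 11/2, 16/3 and 187/35.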

3 Rational approximations 

It follows from the results of § 2 that, for any real $\theta$, each
convergent $p/q$ satisfies $|\theta - p/q| < 1/q^2$. We observe now that,
of any two consecutive convergents, say $p_n/q_n$ and $p_{n+1}/q_{n+1}$,
one at least satisfies $|\theta - p/q| < 1/(2q^2)$. Indeed, since $\theta - p_n/q_n$
and $\theta - p_{n+1}/q_{n+1}$ have opposite signs, we have
$$|\theta - p_n/q_n| + |\theta - p_{n+1}/q_{n+1}| = |p_n/q_n - p_{n+1}/q_{n+1}| = 1/(q_n q_{n+1});$$
but, for any real $\alpha$, $\beta$ with $\alpha \ne \beta$, we have $\alpha\beta < \tfrac12(\alpha^2 + \beta^2)$, whence
$$1/(q_n q_{n+1}) < 1/(2q_n^2) + 1/(2q_{n+1}^2),$$
and this gives the required result. We observe further that, of
any three consecutive convergents, say $p_n/q_n$, $p_{n+1}/q_{n+1}$ and
$p_{n+2}/q_{n+2}$, one at least satisfies $|\theta - p/q| < 1/(\sqrt5\, q^2)$. In fact, if
the result were false, then the equations above would give
$$1/(\sqrt5\, q_n^2) + 1/(\sqrt5\, q_{n+1}^2) \le 1/(q_n q_{n+1}),$$
that is $\lambda + 1/\lambda \le \sqrt5$, where $\lambda = q_{n+1}/q_n$. Since $\lambda$ is rational it
follows that strict inequality holds and so
$$(\lambda - \tfrac12(1 + \sqrt5))(\lambda + \tfrac12(1 - \sqrt5)) < 0,$$
whence $\lambda < \tfrac12(1 + \sqrt5)$. Similarly, on writing $\mu = q_{n+2}/q_{n+1}$, we
would have $\mu < \tfrac12(1 + \sqrt5)$. But, by § 2, we have
$q_{n+2} = a_{n+2} q_{n+1} + q_n$, and thus $\mu \ge 1 + 1/\lambda$; this gives a contradiction,
for if $\lambda < \tfrac12(1 + \sqrt5)$ then $1/\lambda > \tfrac12(-1 + \sqrt5)$.

The latter result confirms a theorem of Hurwitz to the effect
that, for any irrational $\theta$, there exist infinitely many rational $p/q$
such that $|\theta - p/q| < 1/(\sqrt5\, q^2)$. The constant $1/\sqrt5$ is best possible,
as can be verified (see § 5) by taking
$$\theta = \tfrac12(1 + \sqrt5) = [1, 1, 1, \ldots].$$
However if one excludes all irrationals equivalent to $\theta$, that is
those whose continued fractions have all but finitely many partial
quotients equal to 1, then Hurwitz’s theorem holds with $1/\sqrt8$
in place of $1/\sqrt5$, and this is again best possible. There is an
infinite sequence of such results, with constants tending to 1/3,
and they constitute the so-called Markoff chain.
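
The role of the constant $1/\sqrt5$ can be seen numerically. The sketch below is an illustration and not a proof: it evaluates $q^2|\theta - p/q|$ for the convergents of $\theta = \tfrac12(1 + \sqrt5)$ using 60-digit decimal arithmetic, a precision chosen arbitrarily by us, and the values are seen to approach $1/\sqrt5$.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
SQRT5 = Decimal(5).sqrt()
THETA = (1 + SQRT5) / 2


def hurwitz_ratios(count):
    """Values q^2 * |theta - p/q| for the first `count` convergents of [1, 1, 1, ...]."""
    ratios = []
    p, q = 1, 1                      # p_0/q_0 = 1/1
    for _ in range(count):
        ratios.append(q * q * abs(THETA - Decimal(p) / Decimal(q)))
        p, q = p + q, p              # all partial quotients equal 1
    return ratios
```

The successive values lie alternately above and below $1/\sqrt5 = 0.4472\ldots$, so that no constant smaller than $1/\sqrt5$ could serve for infinitely many of these convergents.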

We note next that the convergents give successively closer
approximations to $\theta$. In fact we have the stronger result that
$|q_n\theta - p_n|$ decreases as $n$ increases. To verify this, we observe that
the recurrences in § 2 hold for any indeterminates $a_0, a_1, \ldots$,
whence, for $n \ge 1$, we have
$$\theta = \frac{p_n\theta_{n+1} + p_{n-1}}{q_n\theta_{n+1} + q_{n-1}};$$
thus we obtain
$$|q_n\theta - p_n| = 1/(q_n\theta_{n+1} + q_{n-1}),$$
and the assertion follows since, for $n > 1$, the denominator on
the right exceeds
$$q_n + q_{n-1} = (a_n + 1)q_{n-1} + q_{n-2} > q_{n-1}\theta_n + q_{n-2},$$
and, for $n = 1$, it exceeds $\theta_1$. The argument here shows, incidentally,
that the convergents to $\theta$ satisfy
$$\frac{1}{(a_{n+1} + 2)q_n^2} < |\theta - p_n/q_n| < \frac{1}{a_{n+1} q_n^2}.$$

The convergents are indeed the best approximations to $\theta$ in
the sense that, if $p$, $q$ are integers with $0 < q < q_{n+1}$ then
$|q\theta - p| \ge |q_n\theta - p_n|$. For if we define integers $u$, $v$ by
$$p = up_n + vp_{n+1}, \qquad q = uq_n + vq_{n+1},$$
then it is easily seen that $u \ne 0$ and that, if $v \ne 0$, then $u$, $v$ have
opposite signs; hence, since $q_n\theta - p_n$ and $q_{n+1}\theta - p_{n+1}$ have
opposite signs, we obtain
$$|q\theta - p| = |u(q_n\theta - p_n) + v(q_{n+1}\theta - p_{n+1})| \ge |q_n\theta - p_n|,$$
as required. As a corollary, we deduce that if a rational $p/q$
satisfies $|\theta - p/q| < 1/(2q^2)$ then it is a convergent to $\theta$. In fact
we have $p/q = p_n/q_n$, where $q_n \le q < q_{n+1}$; for clearly
$$|p/q - p_n/q_n| \le |\theta - p/q| + |\theta - p_n/q_n| \le (1/q + 1/q_n)|q\theta - p|,$$
and, since $q \ge q_n$ and $|q\theta - p| < 1/(2q)$, the number on the right
is less than $1/(qq_n)$; hence the number on the left vanishes, as
required.

To conclude this section we remark that, for almost all real $\theta$
in the sense of Lebesgue measure, the inequality
$|\theta - p/q| < 1/(q^2 \log q)$ has infinitely many rational solutions $p/q$; in fact
the same applies to the inequality $|\theta - p/q| < f(q)/q$, where $f$ is
any monotonically decreasing function such that $\sum f(q)$ diverges.
However, almost no $\theta$ have the property if $\sum f(q)$ converges, for
instance if $f(q) = 1/(q(\log q)^{1+\delta})$ with $\delta > 0$.

4 Quadratic irrationals 

By a quadratic irrational we mean a zero of a polynomial
$ax^2 + bx + c$, where $a$, $b$, $c$ are integers and the discriminant
$d = b^2 - 4ac$ is positive and not a perfect square. One of the most
remarkable results in the theory of numbers, known since the
time of Lagrange, is that a continued fraction represents a quadratic
irrational if and only if it is ultimately periodic, that is, if
and only if the partial quotients $a_0, a_1, \ldots$ satisfy $a_{m+n} = a_n$ for
some positive integer $m$ and for all sufficiently large $n$. Thus a
continued fraction $\theta$ is a quadratic irrational if and only if it
has the form
$$\theta = [a_0, a_1, \ldots, a_{k-1}, \overline{a_k, \ldots, a_{k+m-1}}],$$
where the bar indicates that the block of partial quotients is
repeated indefinitely. As examples, we have $\sqrt2 = [1, \overline{2}]$ and
$\tfrac13(3 + \sqrt3) = [1, 1, \overline{1, 2}]$.

It is easy to see that if the continued fraction for $\theta$ has the
above form then $\theta$ is a quadratic irrational. For the number
$$\phi = [\overline{a_k, \ldots, a_{k+m-1}}]$$
is a complete quotient of $\theta$ and so, by § 3, we have, for $k \ge 2$,
$$\theta = \frac{p_{k-1}\phi + p_{k-2}}{q_{k-1}\phi + q_{k-2}},$$
where $p_n/q_n$ ($n = 0, 1, \ldots$) are the convergents to $\theta$; further we
have, for $m \ge 2$,
$$\phi = \frac{p'_{m-1}\phi + p'_{m-2}}{q'_{m-1}\phi + q'_{m-2}},$$
where $p'_n/q'_n$ ($n = 0, 1, \ldots$) are the convergents to $\phi$. It is clear
from the latter equation that $\phi$ is quadratic and hence, by the
preceding equation, so also is $\theta$; and this plainly remains valid
for $k = 0$ and 1, and for $m = 1$. Since the continued fraction for
$\theta$ does not terminate, it follows that $\theta$ is a quadratic irrational,
as required.

To prove the converse, suppose that $\theta$ is a quadratic irrational
so that $\theta$ satisfies an equation $ax^2 + bx + c = 0$, where $a$, $b$, $c$ are
integers with $d = b^2 - 4ac > 0$. We shall consider the binary form
$$f(x, y) = ax^2 + bxy + cy^2.$$
The substitution
$$x = p_n x' + p_{n-1} y', \qquad y = q_n x' + q_{n-1} y',$$
where $p_n/q_n$ ($n = 1, 2, \ldots$) denote the convergents to $\theta$, has
determinant
$$p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1},$$
and so, as in § 1 of Chapter 5, we see that it takes $f$ into a binary
form
$$f_n(x, y) = a_n x^2 + b_n xy + c_n y^2$$
with the same discriminant $d$ as $f$. Further we have $a_n = f(p_n, q_n)$
and $c_n = f(p_{n-1}, q_{n-1}) = a_{n-1}$. Now $f(\theta, 1) = 0$ and so
$$a_n/q_n^2 = f(p_n/q_n, 1) - f(\theta, 1) = a((p_n/q_n)^2 - \theta^2) + b((p_n/q_n) - \theta).$$
By § 2 we have $|\theta - p_n/q_n| < 1/q_n^2$, whence
$$|\theta^2 - (p_n/q_n)^2| < |\theta + p_n/q_n|/q_n^2 \le (2|\theta| + 1)/q_n^2.$$
Thus we see that
$$|a_n| < (2|\theta| + 1)|a| + |b|,$$
that is, $a_n$ is bounded independently of $n$. Since $c_n = a_{n-1}$ and
$b_n^2 - 4a_n c_n = d$, it follows that $b_n$ and $c_n$ are likewise bounded.
But, for $n \ge 1$, we have
$$\theta = \frac{p_n\theta_{n+1} + p_{n-1}}{q_n\theta_{n+1} + q_{n-1}},$$
where $\theta_1, \theta_2, \ldots$ denote the complete quotients of $\theta$, and so
$f_n(\theta_{n+1}, 1) = 0$. This implies that there are only finitely many
possibilities for $\theta_1, \theta_2, \ldots$, whence $\theta_{l+m} = \theta_l$ for some positive $l$,
$m$. Hence the continued fraction for $\theta$ is ultimately periodic, as
required.

The continued fraction of a quadratic irrational $\theta$ is said to
be purely periodic if $k = 0$ in the expression indicated above. It
is easy to show that this occurs if and only if $\theta > 1$ and the
conjugate $\theta'$ of $\theta$, that is, the other root of the quadratic equation
defining $\theta$, satisfies $-1 < \theta' < 0$. Indeed if $\theta > 1$ and $-1 < \theta' < 0$
then it is readily verified by induction that the conjugates $\theta'_n$ of
the complete quotients $\theta_n$ ($n = 1, 2, \ldots$) of $\theta$ likewise satisfy
$-1 < \theta'_n < 0$; one needs to refer only to the relation
$\theta'_n = a_n + 1/\theta'_{n+1}$, where $\theta = [a_0, a_1, \ldots]$, together with the fact that
$a_n \ge 1$ for all $n$ including $n = 0$. The inequality $-1 < \theta'_n < 0$ shows
that $a_n = [-1/\theta'_{n+1}]$. Now since $\theta$ is a quadratic irrational we
have $\theta_m = \theta_n$ for some distinct $m$, $n$; but this gives $1/\theta'_m = 1/\theta'_n$,
whence $a_{m-1} = a_{n-1}$. It follows that $\theta_{m-1} = \theta_{n-1}$, and repetition
of this conclusion yields $\theta = \theta_{n-m}$, assuming that $n > m$. Hence
$\theta$ is purely periodic. Conversely, if $\theta$ is purely periodic, then
$\theta > a_0 \ge 1$. Further, for some $n \ge 1$, we have
$$\theta = \frac{p_n\theta + p_{n-1}}{q_n\theta + q_{n-1}},$$
where $p_n/q_n$ ($n = 1, 2, \ldots$) denote the convergents to $\theta$, and thus
$\theta$ satisfies the equation
$$q_n x^2 + (q_{n-1} - p_n)x - p_{n-1} = 0.$$
Now the quadratic on the left has the value $-p_{n-1} < 0$ for $x = 0$,
and it has the value $p_n + q_n - (p_{n-1} + q_{n-1}) > 0$ for $x = -1$. Hence
the conjugate $\theta'$ of $\theta$ satisfies $-1 < \theta' < 0$, as required.

As an immediate corollary we see that the continued fractions
of $\sqrt d + [\sqrt d]$ and $1/(\sqrt d - [\sqrt d])$ are purely periodic, where $d$ is
any positive integer, not a perfect square. Moreover this implies
that the continued fraction of $\sqrt d$ is almost purely periodic in
the sense that, here, $k = 1$. The convergents to $\sqrt d$, incidentally,
are closely related to the solutions of the Pell equation about
which we shall speak in Chapter 8.
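
The periodic expansion of $\sqrt d$ can be computed in integers alone. The sketch below uses the standard recurrence for complete quotients written in the form $(m + \sqrt d)/k$, which is a device brought in for the illustration and not the argument of the text; it also relies on the known fact that the period of $\sqrt d$ ends with the partial quotient $2[\sqrt d]$. The function name is ours.

```python
from math import isqrt


def sqrt_continued_fraction(d):
    """Return (a_0, period) such that sqrt(d) = [a_0, period repeated indefinitely]."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, k, a = 0, 1, a0               # current complete quotient is (m + sqrt(d)) / k
    period = []
    while a != 2 * a0:               # the period closes with the quotient 2 * a_0
        m = k * a - m
        k = (d - m * m) // k
        a = (a0 + m) // k
        period.append(a)
    return a0, period
```

For example $\sqrt7 = [2, \overline{1, 1, 1, 4}]$, in agreement with the statement that here $k = 1$.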

5 Liouville’s theorem 

The work of § 4 shows that every quadratic irrational $\theta$
has bounded partial quotients. It follows from the results of § 3
that there exists a number $c = c(\theta) > 0$ such that the inequality
$|\theta - p/q| > c/q^2$ holds for all rationals $p/q$ ($q > 0$). Liouville
proved in 1844 that a theorem of the latter kind is valid more
generally for any algebraic irrational, and his discovery led to
the first demonstration of the existence of transcendental
numbers.

A real or complex number is said to be algebraic if it is a zero
of a polynomial
$$P(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n,$$
where $a_0, a_1, \ldots, a_n$ denote integers, not all 0. For each algebraic
number $\alpha$ there is a polynomial $P$ as above, with least degree,
such that $P(\alpha) = 0$, and $P$ is unique if one assumes that $a_0 > 0$
and that $a_0, a_1, \ldots, a_n$ are relatively prime; obviously $P$ is
irreducible over the rationals, and it is called the minimal polynomial
for $\alpha$. The degree of $\alpha$ is defined as the degree of $P$.

Liouville's theorem states that for any algebraic number $\alpha$
with degree $n > 1$ there exists a number $c = c(\alpha) > 0$ such that
the inequality $|\alpha - p/q| > c/q^n$ holds for all rationals $p/q$ ($q > 0$).
For the proof, we shall assume, as clearly we may, that $\alpha$ is real,
and we shall apply the mean-value theorem to $P$, the minimal
polynomial for $\alpha$. We have, for any rational $p/q$ ($q > 0$),

$$P(\alpha) - P(p/q) = (\alpha - p/q)\,P'(\xi),$$

where $P'(x)$ denotes the derivative of $P$, and $\xi$ lies between $\alpha$
and $p/q$. Now we have $P(\alpha) = 0$ and, since $P$ is irreducible, we
have also $P(p/q) \ne 0$. But $q^n P(p/q)$ is an integer and so
$|P(p/q)| \ge 1/q^n$. We can suppose that $|\alpha - p/q| < 1$, for otherwise
the theorem certainly holds; then we have $|\xi| < |\alpha| + 1$ and so
$|P'(\xi)| < C$ for some $C = C(\alpha)$. This gives $|\alpha - p/q| > c/q^n$, where
$c = 1/C$, as required.
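
The constant furnished by the proof can be tested numerically. The sketch below is an illustration under stated assumptions: the names `liouville_constant` and `check_liouville` are hypothetical, $C$ is taken as a crude termwise bound for $|P'(x)|$ on $|x| \le |\alpha| + 1$, the value $c$ is capped at 1 to cover the case $|\alpha - p/q| \ge 1$, and floating-point arithmetic is used, so the check is only meaningful for moderate $q$.

```python
def liouville_constant(coeffs, alpha):
    """Return c = min(1, 1/C) for P(x) = a_0 x^n + ... + a_n (coeffs = [a_0, ..., a_n]).

    C bounds |P'(x)| termwise on |x| <= |alpha| + 1.
    """
    n = len(coeffs) - 1
    bound = abs(alpha) + 1
    C = sum(abs(a) * (n - i) * bound ** (n - i - 1)
            for i, a in enumerate(coeffs[:-1]))
    return min(1.0, 1.0 / C)

def check_liouville(coeffs, alpha, max_q):
    """Check |alpha - p/q| > c / q^n for the nearest p and every 1 <= q <= max_q."""
    n = len(coeffs) - 1
    c = liouville_constant(coeffs, alpha)
    for q in range(1, max_q + 1):
        p = round(alpha * q)  # the nearest numerator gives the smallest distance
        if not abs(alpha - p / q) > c / q ** n:
            return False
    return True

if __name__ == "__main__":
    alpha = 2 ** (1 / 3)  # zero of x^3 - 2, degree 3
    print(liouville_constant([1, 0, 0, -2], alpha))
    print(check_liouville([1, 0, 0, -2], alpha, 2000))
```

Only the nearest numerator $p$ is examined for each $q$, since every other choice of $p$ lies farther from $\alpha$.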

The proof here enables one to furnish an explicit value for $c$
in terms of the degree of $P$ and its coefficients. Let us use this
observation to confirm the assertion made in § 3 concerning
$\alpha = \tfrac12(1 + \sqrt5)$. In this case we have $P(x) = x^2 - x - 1$ and so
$P'(x) = 2x - 1$. Let $p/q$ ($q > 0$) be any rational and let $\delta = |\alpha - p/q|$. Then
$|P(p/q)| \le \delta\,|P'(\xi)|$ for some $\xi$ between $\alpha$ and $p/q$. Now clearly
$|\xi| < \alpha + \delta$ and so

$$|P'(\xi)| \le 2(\alpha + \delta) - 1 = 2\delta + \sqrt5.$$

But $|P(p/q)| \ge 1/q^2$, whence $\delta(2\delta + \sqrt5) \ge 1/q^2$. This implies that
for any $c'$ with $c' < 1/\sqrt5$ and for all sufficiently large $q$ we have
$\delta > c'/q^2$. Hence Hurwitz's theorem (see § 3) is best possible.

A real or complex number that is not algebraic is said to be
transcendental. It is now easy to give an example; consider, in
fact, the series

$$\theta = 2^{-1!} + 2^{-2!} + 2^{-3!} + \cdots.$$

If we put

$$p_j = 2^{j!}\,(2^{-1!} + 2^{-2!} + \cdots + 2^{-j!}), \qquad q_j = 2^{j!} \quad (j = 1, 2, \ldots),$$

then $p_j$, $q_j$ are integers, and we have

$$|\theta - p_j/q_j| = 2^{-(j+1)!} + 2^{-(j+2)!} + \cdots.$$

But the sum on the right is less than

$$2^{-(j+1)!}\,(1 + 2^{-1} + 2^{-2} + \cdots) = 2^{-(j+1)!+1} \le q_j^{-j},$$

and it follows readily from Liouville's theorem that $\theta$ is
transcendental. Indeed any real number $\theta$ for which there exists an
infinite sequence of distinct rationals $p_j/q_j$ satisfying
$|\theta - p_j/q_j| < 1/q_j^{\omega_j}$, where $\omega_j \to \infty$ as $j \to \infty$, will be
transcendental. For instance, this will hold for any infinite decimal
in which there occur sufficiently long blocks of zeros or any
continued fraction in which the partial quotients increase
sufficiently rapidly.
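
The rational approximations used in this example are easily produced in exact arithmetic. The sketch below is illustrative only; the function names `liouville_partial` and `tail_upper_bound` are hypothetical. It builds $p_j$, $q_j$ and compares the geometric bound $2^{-(j+1)!+1}$ for the tail with $q_j^{-j}$.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(j):
    """Return (p_j, q_j) with p_j / q_j = 2^(-1!) + ... + 2^(-j!) and q_j = 2^(j!)."""
    q = 2 ** factorial(j)
    p = sum(2 ** (factorial(j) - factorial(i)) for i in range(1, j + 1))
    return p, q

def tail_upper_bound(j):
    """Geometric-series bound 2^(-(j+1)! + 1) for theta - p_j / q_j."""
    return Fraction(1, 2 ** (factorial(j + 1) - 1))

if __name__ == "__main__":
    for j in range(1, 6):
        p, q = liouville_partial(j)
        # The bound never exceeds q_j^(-j); the true tail is strictly smaller.
        print(j, tail_upper_bound(j) <= Fraction(1, q ** j))
```

Since the true tail is strictly smaller than the geometric bound, the approximations satisfy $|\theta - p_j/q_j| < q_j^{-j}$ for every $j$.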

There have been some remarkable improvements on
Liouville's theorem, beginning with a famous work of Thue in
1909. He showed that for any algebraic number $\alpha$ with degree
$n > 1$ and for any $\kappa > \tfrac12 n + 1$ there exists $c = c(\alpha, \kappa) > 0$ such that
$|\alpha - p/q| > c/q^{\kappa}$ for all rationals $p/q$ ($q > 0$). The condition on
$\kappa$ was relaxed by Siegel in 1921 to $\kappa > 2\sqrt{n}$ and it was further
relaxed by Dyson and Gelfond, independently, in 1947 to
$\kappa > \sqrt{2n}$. Finally Roth proved in 1955 that it is enough to take
$\kappa > 2$, and this is plainly best possible. There is an intimate
connection between such results and the theory of Diophantine
equations (see Chapter 8). In this context it is important to know
whether the numbers $c(\alpha, \kappa)$ can be evaluated explicitly, that
is, whether the results are effective. In fact all the improvements
on Liouville's theorem referred to above are, in that sense,
ineffective; for they involve a hypothetical assumption, made at
the outset, that the inequalities in question have at least one
large solution. Nevertheless effective results have been successfully
obtained for particular algebraic numbers; for instance
Baker proved in 1964 from properties of hypergeometric functions
that, for all rationals $p/q$ ($q > 0$), we have

$$|\sqrt[3]{2} - p/q| > 10^{-6}/q^{2.955}.$$

Moreover, a small but general effective improvement on
Liouville's theorem, that is, valid for any algebraic $\alpha$, has been
established by way of the theory of linear forms in logarithms
referred to in the next section.

6 Transcendental numbers 

In 1873 Hermite began a new era in number theory
when he succeeded in proving that $e$, the natural base for
logarithms, is transcendental. It had earlier been established that
$e$ was neither rational nor quadratic irrational; indeed the
continued fraction for $e$ was known, namely

$$e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \ldots].$$

But Hermite's work rested on quite different ideas concerning
the approximation of analytic functions by rational functions.
In 1882 Lindemann found a generalization of Hermite's argument
and he obtained thereby his famous proof of the transcendence
of $\pi$. This sufficed to solve the ancient Greek problem of
constructing, with ruler and compasses only, a square with area
equal to that of a given circle. In fact, given a unit length, all
the points in the plane that are capable of construction are given
by the intersection of lines and circles, whence their co-ordinates
in a suitable frame of reference are algebraic numbers. Hence
the transcendence of $\pi$ implies that the length $\sqrt{\pi}$ cannot be
classically constructed and so the quadrature of the circle is
impossible. Lindemann actually proved that for any distinct
algebraic numbers $\alpha_1, \ldots, \alpha_n$ and any non-zero algebraic numbers
$\beta_1, \ldots, \beta_n$ we have

$$\beta_1 e^{\alpha_1} + \cdots + \beta_n e^{\alpha_n} \ne 0.$$

The transcendence of $\pi$ follows in view of Euler's identity
$e^{i\pi} = -1$; and the result plainly includes also the transcendence
of $e$, of $\log \alpha$ for algebraic $\alpha$ not 0 or 1, and of the trigonometrical
functions $\cos \alpha$, $\sin \alpha$ and $\tan \alpha$ for all non-zero algebraic $\alpha$.
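
The continued fraction for $e$ quoted above lends itself to a quick numerical check. The following sketch assumes the pattern $2, 1, 2, 1, 1, 4, 1, 1, 6, \ldots$ continues as displayed; the helper names `e_partial_quotients` and `convergents` are hypothetical, and the convergents are formed by the usual recurrences for $p_n$ and $q_n$.

```python
from fractions import Fraction
from math import e

def e_partial_quotients(count):
    """First `count` partial quotients of e: 2, then the blocks 1, 2k, 1."""
    quotients = [2]
    i = 1
    while len(quotients) < count:
        # Positions 2, 5, 8, ... carry the even entries 2, 4, 6, ...
        quotients.append(2 * (i + 1) // 3 if i % 3 == 2 else 1)
        i += 1
    return quotients[:count]

def convergents(quotients):
    """Convergents p_n / q_n from p_n = a_n p_{n-1} + p_{n-2}, and likewise for q_n."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    out = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

if __name__ == "__main__":
    cs = convergents(e_partial_quotients(12))
    print(cs[-1], abs(float(cs[-1]) - e))
```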

In the sense of Lebesgue measure, 'almost all' numbers are
transcendental; in fact as Cantor observed in 1874, the set of all
algebraic numbers is countable. However, it has proved
notoriously difficult to demonstrate the transcendence of particular
numbers; for instance, Euler's constant $\gamma$ has resisted any
attack, and the same applies to the values $\zeta(2n + 1)$ ($n = 1, 2, \ldots$)
of the Riemann zeta-function, though Apéry has recently demonstrated
that $\zeta(3)$ is irrational. In 1900, Hilbert raised as the
seventh of his famous list of 23 problems, the question of proving
the transcendence of $2^{\sqrt2}$ and, more generally, that of $\alpha^{\beta}$ for
algebraic $\alpha$ not 0 or 1 and algebraic irrational $\beta$. Hilbert
expressed the opinion that a solution lay farther in the future
than the Riemann hypothesis or Fermat's last theorem. But
remarkably, in 1929, following studies on integral integer-valued
functions, Gelfond succeeded in verifying the special case that
$e^{\pi} = (-1)^{-i}$ is transcendental, and a complete solution to Hilbert's
seventh problem was established by Gelfond and Schneider
independently in 1934. A generalization of the Gelfond-Schneider
theorem was obtained by Baker in 1966; this furnished,
for instance, the transcendence of $e^{\beta_0}\alpha_1^{\beta_1} \cdots \alpha_n^{\beta_n}$, and
indeed that of any non-vanishing linear form

$$\beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n,$$

where the $\alpha$s and $\beta$s denote non-zero algebraic numbers. The
work enabled quantitative versions of the results to be established,
giving positive lower bounds for linear forms in
logarithms, and these have played a crucial role in the effective
solution of a wide variety of Diophantine problems. We have
already referred to one such application at the end of § 5; we
shall mention some others later.

Several classical functions, apart from $e^z$, have been shown to
assume transcendental values at non-zero algebraic values of the
argument; these include the Weierstrass elliptic function $\wp(z)$,
the Bessel function $J_0(z)$ and the elliptic modular function $j(z)$,
where, in the latter case, $z$ is necessarily neither real nor
imaginary quadratic. In fact there is now a rich and fertile theory
relating to the transcendence and algebraic independence of
values assumed by analytic functions, and we refer to § 8 for an
introduction to the literature.

To illustrate a few of the basic techniques of the theory, we
give now a short proof of the transcendence of $e$; the argument
can be extended quite easily to furnish the transcendence of $\pi$
and indeed the general Lindemann theorem. The proof depends
on properties of the integral

$$I(t) = \int_0^t e^{t-x} f(x)\,dx,$$

defined for $t \ge 0$, where $f$ is a real polynomial with degree $m$.
By integration by parts we have

$$I(t) = e^t \sum_{j=0}^{m} f^{(j)}(0) - \sum_{j=0}^{m} f^{(j)}(t),$$

where $f^{(j)}(x)$ denotes the $j$th derivative of $f(x)$. Further we
observe that, if $\bar f$ denotes the polynomial obtained from $f$ by
replacing each coefficient with its absolute value, then

$$|I(t)| \le \int_0^t |e^{t-x} f(x)|\,dx \le t\,e^t \bar f(t).$$

Suppose now that $e$ is algebraic, so that

$$a_0 + a_1 e + \cdots + a_n e^n = 0$$

for some integers $a_0, a_1, \ldots, a_n$ with $a_0 \ne 0$. We put

$$f(x) = x^{p-1}(x - 1)^p \cdots (x - n)^p,$$

where $p$ is a large prime; then the degree $m$ of $f$ is $(n + 1)p - 1$.
We shall compare estimates for

$$J = a_0 I(0) + a_1 I(1) + \cdots + a_n I(n).$$

By the above equations we see that

$$J = -\sum_{j=0}^{m} \sum_{k=0}^{n} a_k f^{(j)}(k).$$

Now, when $1 \le k \le n$, we have $f^{(j)}(k) = 0$ for $j < p$, and

$$f^{(j)}(k) = \binom{j}{p}\, p!\, g^{(j-p)}(k)$$

for $j \ge p$, where $g(x) = f(x)/(x - k)^p$. Thus, for all $j$, $f^{(j)}(k)$ is an
integer divisible by $p!$. Further, we have $f^{(j)}(0) = 0$ for $j < p - 1$,
and

$$f^{(j)}(0) = \binom{j}{p-1}\,(p-1)!\, h^{(j-p+1)}(0)$$

for $j \ge p - 1$, where $h(x) = f(x)/x^{p-1}$. Clearly $h^{(j)}(0)$ is an integer
divisible by $p$ for $j > 0$, and $h(0) = (-1)^{np}(n!)^p$. Thus, for $j \ne p - 1$,
$f^{(j)}(0)$ is an integer divisible by $p!$, and $f^{(p-1)}(0)$ is an integer
divisible by $(p - 1)!$ but not by $p$ for $p > n$. It follows that $J$ is
a non-zero integer divisible by $(p - 1)!$, whence $|J| \ge (p - 1)!$. On
the other hand, the trivial estimates $\bar f(k) \le (2n)^m$ and $m \le 2np$
give

$$|J| \le |a_1|\,e\,\bar f(1) + \cdots + |a_n|\,n\,e^n \bar f(n) \le c^p$$

for some $c$ independent of $p$. The inequalities are inconsistent
for $p$ sufficiently large, and the contradiction shows that $e$ is
transcendental, as required.
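
The divisibility properties of the derivatives of $f$ on which the proof turns can be confirmed by machine for small $n$ and $p$. The sketch below is an illustration only; the helper names (`hermite_poly`, `derivative_values` and so on) are hypothetical, and polynomials are stored as integer coefficient lists, lowest degree first.

```python
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_deriv(a):
    """Derivative of a polynomial; the zero polynomial is returned as [0]."""
    return [i * c for i, c in enumerate(a)][1:] or [0]

def poly_eval(a, x):
    """Evaluate by Horner's rule."""
    acc = 0
    for c in reversed(a):
        acc = acc * x + c
    return acc

def hermite_poly(n, p):
    """f(x) = x^(p-1) (x-1)^p ... (x-n)^p as used in the proof."""
    f = [0] * (p - 1) + [1]
    for k in range(1, n + 1):
        for _ in range(p):
            f = poly_mul(f, [-k, 1])
    return f

def derivative_values(f, n):
    """table[j][k] = f^(j)(k) for 0 <= j <= deg f and 0 <= k <= n."""
    table, g = [], f
    for _ in range(len(f)):
        table.append([poly_eval(g, k) for k in range(n + 1)])
        g = poly_deriv(g)
    return table

if __name__ == "__main__":
    n, p = 2, 5
    table = derivative_values(hermite_poly(n, p), n)
    print(table[p - 1][0], table[p - 1][0] % p)
```

For $n = 2$ and $p = 5$ the only derivative value not divisible by $p!$ is $f^{(p-1)}(0) = (p-1)!\,(-1)^{np}(n!)^p$, which is what makes $J$ non-zero in the argument.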

7 Minkowski’s theorem 

Practically intuitive deductions relating to the geometry
of figures in the plane, or, more generally, in Euclidean $n$-space,
can sometimes yield results of great importance in number
theory. It was Minkowski who first systematically exploited this
observation and he called the resulting study the Geometry of
Numbers. The most famous theorem in this context is the convex
body theorem that Minkowski obtained in 1896. By a convex
body we mean a bounded, open set of points in Euclidean
$n$-space that contains $\lambda x + (1 - \lambda)y$ for all $\lambda$ with $0 < \lambda < 1$
whenever it contains $x$ and $y$. A set of points is said to be
symmetric about the origin if it contains $-x$ whenever it contains
$x$. The simplest form of Minkowski's theorem asserts that if a
convex body $\mathcal{S}$, symmetric about the origin, has volume exceeding
$2^n$ then it contains an integer point other than the origin.
By an integer point we mean a point all of whose co-ordinates
are integers.
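
The assertion can be tried out in the plane by exhaustive search. The sketch below is an illustration under stated assumptions: the body is the open ellipse $(x - \sqrt2 y)^2/a^2 + (x + \sqrt2 y)^2/b^2 < 1$ with the arbitrarily chosen values $a = 0.2$, $b = 20$, whose area $\pi ab/(2\sqrt2)$ exceeds $2^2 = 4$; the function names are hypothetical.

```python
from math import pi, sqrt

def make_ellipse(a, b):
    """Indicator of the open ellipse (x - sqrt2*y)^2/a^2 + (x + sqrt2*y)^2/b^2 < 1."""
    r2 = sqrt(2)
    def inside(x, y):
        return ((x - r2 * y) / a) ** 2 + ((x + r2 * y) / b) ** 2 < 1
    return inside

def ellipse_area(a, b):
    """Area pi*a*b divided by the determinant 2*sqrt2 of the linear substitution."""
    return pi * a * b / (2 * sqrt(2))

def nonzero_integer_point(inside, radius):
    """First non-zero integer point of the body, searching squares of growing size."""
    for r in range(1, radius + 1):
        for x in range(-r, r + 1):
            for y in range(-r, r + 1):
                if max(abs(x), abs(y)) == r and inside(x, y):
                    return x, y
    return None

if __name__ == "__main__":
    body = make_ellipse(0.2, 20)
    print(ellipse_area(0.2, 20), nonzero_integer_point(body, 10))
```

The body is long and thin, so neither $(1, 0)$ nor $(0, 1)$ belongs to it; the point found is one that nearly satisfies $x = \sqrt2\,y$.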

For the proof, it will suffice to verify the following result due
to Blichfeldt: any bounded region $\mathcal{R}$ with volume $V$ exceeding
1 contains distinct points $x$, $y$ such that $x - y$ is an integer point.
Minkowski's theorem follows on taking $\mathcal{R} = \tfrac12\mathcal{S}$, that is the set
of points $\tfrac12 x$ with $x$ in $\mathcal{S}$, and noting that if $x$ and $y$ belong to
$\mathcal{R}$ then $2x$ and $-2y$ belong to $\mathcal{S}$ whence $x - y = \tfrac12(2x - 2y)$ also
belongs to $\mathcal{S}$. To prove Blichfeldt's result, we note that $\mathcal{R}$ is the
union of disjoint subsets $\mathcal{R}_u$, where $u = (u_1, \ldots, u_n)$ runs through
all integer points and $\mathcal{R}_u$ denotes the part of $\mathcal{R}$ that lies in the
interval $u_j \le x_j < u_j + 1$ ($1 \le j \le n$). Thus $V = \sum V_u$, where $V_u$
denotes the volume of $\mathcal{R}_u$, and, by hypothesis, we obtain $\sum V_u > 1$.
It follows that if each of the regions $\mathcal{R}_u$ is translated by $-u$
so as to lie in the interval $0 \le x_j < 1$ ($1 \le j \le n$), then at least two
of the translated regions, say the translates of $\mathcal{R}_u$ and $\mathcal{R}_v$, must
overlap. Hence there exist points $x$ in $\mathcal{R}_u$ and $y$ in $\mathcal{R}_v$ such that
$x - u = y - v$, and so $x - y$ is an integer point, as required.

In order to state the more general form of Minkowski's theorem
we need the concept of a lattice. First we recall that points
$a_1, \ldots, a_n$ in Euclidean $n$-space are said to be linearly independent
if the only real numbers $t_1, \ldots, t_n$ satisfying
$t_1 a_1 + \cdots + t_n a_n = 0$ are $t_1 = \cdots = t_n = 0$; this is equivalent to the condition
that $d = \det(a_{ij}) \ne 0$, where $a_j = (a_{1j}, \ldots, a_{nj})$. By a lattice $\Lambda$ we
mean a set of points of the form

$$x = u_1 a_1 + \cdots + u_n a_n,$$

where $a_1, \ldots, a_n$ are fixed linearly independent points and
$u_1, \ldots, u_n$ run through all the integers. The determinant of $\Lambda$ is
defined as $d(\Lambda) = |d|$. With this notation, the general Minkowski
theorem asserts that if, for any lattice $\Lambda$, a convex body $\mathcal{S}$,
symmetric about the origin, has volume exceeding $2^n d(\Lambda)$, then
it contains a point of $\Lambda$ other than the origin. The result can be
established by simple modifications to the earlier arguments.

As an immediate application, let $\lambda_1, \ldots, \lambda_n$ be positive numbers
and let $\mathcal{S}$ be the convex body $|x_j| < \lambda_j$ ($1 \le j \le n$); the volume
of $\mathcal{S}$ is $2^n \lambda_1 \cdots \lambda_n$. Thus, on writing

$$L_j = u_1 a_{j1} + \cdots + u_n a_{jn} \quad (1 \le j \le n),$$

we deduce that if $\lambda_1 \cdots \lambda_n > d(\Lambda)$ then there exist integers
$u_1, \ldots, u_n$, not all 0, such that $|L_j| < \lambda_j$ ($1 \le j \le n$). This is referred
to as Minkowski's linear forms theorem. It can be sharpened
slightly to show that if $\lambda_1 \cdots \lambda_n = d(\Lambda)$ then there exist integers
$u_1, \ldots, u_n$, not all 0, such that $|L_1| \le \lambda_1$ and $|L_j| < \lambda_j$ ($2 \le j \le n$).
In fact, for each $m = 1, 2, \ldots$ there exists a non-zero integer point
$u_m$ for which $|L_1| < \lambda_1 + 1/m$ and $|L_j| < \lambda_j$ ($2 \le j \le n$); but the $u_m$
are bounded, and so $u_m = u$ for some fixed $u$ and infinitely many
$m$, whence $u = (u_1, \ldots, u_n)$ has the required properties.
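
The linear forms theorem can likewise be illustrated by a direct search. The sketch below is not part of the text: the function name `small_linear_forms` is hypothetical, the search is confined to an arbitrary box $|u_i| \le$ `radius`, and the example forms $L_1 = u_1 + \sqrt2 u_2$, $L_2 = u_1 - \sqrt2 u_2$ (determinant $2\sqrt2$ in absolute value) are chosen only for illustration.

```python
from itertools import product
from math import sqrt

def small_linear_forms(matrix, lambdas, radius):
    """Find integers u, not all 0, with |L_j(u)| < lambda_j for every j.

    matrix[j][i] holds a_{ji}, so that L_j = sum_i u_i * a_{ji}.
    Returns None if no such u exists with all |u_i| <= radius.
    """
    n = len(matrix)
    for u in product(range(-radius, radius + 1), repeat=n):
        if not any(u):
            continue  # skip the origin
        forms = [sum(u[i] * row[i] for i in range(n)) for row in matrix]
        if all(abs(L) < lam for L, lam in zip(forms, lambdas)):
            return u
    return None

if __name__ == "__main__":
    a = [[1.0, sqrt(2)], [1.0, -sqrt(2)]]
    # lambda_1 * lambda_2 = 3 exceeds d(Lambda) = 2*sqrt(2), so a solution exists.
    print(small_linear_forms(a, (0.1, 30.0), 10))
```
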
Minkowski's linear forms theorem implies that if $\theta_1, \ldots, \theta_n$
are any real numbers and if $Q > 0$ then there exist integers $p$,
$q_1, \ldots, q_n$, not all 0, such that $|q_j| < Q$ ($1 \le j \le n$) and

$$|q_1\theta_1 + \cdots + q_n\theta_n - p| \le Q^{-n}.$$

Similarly we see that there exist integers $p_1, \ldots, p_n$, $q$, not all 0,
such that $|q| \le Q^n$ and $|q\theta_j - p_j| < 1/Q$ ($1 \le j \le n$). It follows that,
if one at least of $\theta_1, \ldots, \theta_n$ is irrational, then

$$|\theta_j - p_j/q| < q^{-1-1/n} \quad (1 \le j \le n)$$

for infinitely many rationals $p_j/q$ ($q > 0$). These results generalize
Dirichlet's theorem discussed in § 1. In the opposite direction,
it is easy to extend the observation on quadratic irrationals made
in § 5 to show that, when $\theta$ is an algebraic number with degree
$n + 1$, there exists $c = c(\theta) > 0$ such that

$$|q_1\theta + \cdots + q_n\theta^n - p| > c\,q^{-n}$$

for all integers $p, q_1, \ldots, q_n$ with $q = \max|q_j| > 0$. This implies,
by a classical transference principle, that the exponent $-1 - 1/n$
above is best possible. It is known from transcendence theory
that, for any $\varepsilon > 0$, there exists $c > 0$ such that

$$|q_1 e + \cdots + q_n e^n - p| > c\,q^{-n-\varepsilon}$$

for all integers $p, q_1, \ldots, q_n$ with $q = \max|q_j| > 0$. Moreover, some
deep work of Schmidt, generalizing the Thue-Siegel-Roth
theorem, shows that the same holds when $e, \ldots, e^n$ are replaced
by algebraic numbers $\theta_1, \ldots, \theta_n$ with $1, \theta_1, \ldots, \theta_n$ linearly
independent over the rationals; in analogy with lattice points,
we say that real numbers $\phi_1, \ldots, \phi_m$ are linearly independent
over the rationals if the only rationals $t_1, \ldots, t_m$ satisfying
$t_1\phi_1 + \cdots + t_m\phi_m = 0$ are $t_1 = \cdots = t_m = 0$.

Minkowski conjectured that if $L_1, \ldots, L_n$ are linear forms as
above and if $\theta_1, \ldots, \theta_n$ are any real numbers then there exist
integers $u_1, \ldots, u_n$ such that

$$|(L_1 - \theta_1) \cdots (L_n - \theta_n)| \le 2^{-n} d(\Lambda).$$

At present the conjecture remains open, but it is trivial in the
case $n = 1$ and Minkowski himself proved that it is valid in the
case $n = 2$. It has subsequently been verified for $n = 3$, 4 and 5,
and Tchebotarev showed that it holds for all $n$ if $2^{-n}$ is replaced
by $2^{-n/2}$. Minkowski's work furnished a result to the effect that
if $\theta$ is irrational and $\theta'$ is not of the form $r\theta + s$ for integers $r$, $s$,
then there are infinitely many integers $q \ne 0$ such that, for some
integer $p$,

$$|q\theta - p - \theta'| < 1/(4|q|);$$

and here the constant 1/4 is best possible. The result implies
that the numbers $\{n\theta\}$, where $n = 1, 2, \ldots$, are dense in the unit
interval, that is, for every $\theta'$ with $0 < \theta' < 1$, and for every $\varepsilon > 0$,
we have $|\{n\theta\} - \theta'| < \varepsilon$ for some $n$. A famous theorem of Kronecker
implies that, more generally, the points

$$(\{n\theta_1\}, \ldots, \{n\theta_m\}) \quad (n = 1, 2, \ldots),$$

where $1, \theta_1, \ldots, \theta_m$ are linearly independent over the rationals,
are dense in the unit cube in Euclidean $m$-space.
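
The density of the numbers $\{n\theta\}$ is readily observed in practice. The following is a minimal sketch with a hypothetical function name `first_n_close`; it uses floating-point arithmetic and an arbitrary search limit, so it illustrates the statement for a particular target and tolerance and proves nothing.

```python
from math import sqrt

def first_n_close(theta, target, eps, max_n=10 ** 6):
    """Smallest n <= max_n with |{n*theta} - target| < eps, or None if none is found."""
    for n in range(1, max_n + 1):
        frac = (n * theta) % 1.0  # fractional part {n*theta}
        if abs(frac - target) < eps:
            return n
    return None

if __name__ == "__main__":
    theta = sqrt(2)
    for target in (0.25, 0.5, 0.9):
        n = first_n_close(theta, target, 0.001)
        print(target, n, (n * theta) % 1.0)
```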

8 Further reading 

The classic text on continued fractions is Perron's Die
Lehre von den Kettenbrüchen (Teubner, Leipzig, 1913). There
are, however, useful accounts in most introductory works on
number theory; see, in particular, Cassels' An introduction to
Diophantine approximation (Cambridge U.P., 1957), and the
books of Niven and Zuckerman (Wiley, New York, 1966) and
of Hardy and Wright (Oxford, 5th edn, 1979) cited earlier. A
nice, short work is Khintchine's Kettenbrüche (Teubner, Leipzig,
1956).

Numerous references to the literature relating to § 5 and § 6 
are given in Baker's Transcendental number theory (Cambridge 
U.P., 2nd edn, 1979). For advanced work concerning rational 
approximations to algebraic numbers see W. M. Schmidt, 
Diophantine approximation (Springer Math. Lecture Notes, 785, 
Berlin 1980). The topics referred to in § 7 are discussed fully in 
Cassels' An introduction to the geometry of numbers (Springer- 
Verlag, Berlin, 1971). 

9 Exercises 

(i) Evaluate the continued fraction $[1, 2, 3, 1, 4]$.

(ii) Assuming that $\pi$ is given by 3.141 592 6 . . . , correct to
seven decimal places, prove that the first three
convergents to $\pi$ are $\frac{22}{7}$, $\frac{333}{106}$ and $\frac{355}{113}$. Verify that
$|\pi - \frac{355}{113}| < 10^{-6}$.

(iii) Let $\theta$, $\theta'$ be the roots of the equation $x^2 - ax - 1 = 0$,
where $a$ is a positive integer and $\theta > 0$. Show that the
denominators in the convergents to $\theta$ are given by
$q_{n-1} = (\theta^n - \theta'^n)/(\theta - \theta')$. Verify that the Fibonacci
sequence 1, 1, 2, 3, 5, . . . is given by $q_0, q_1, \ldots$ in a
special case.

(iv) Prove that the denominators $q_n$ in the convergents to
any real $\theta$ satisfy $q_n \ge (\tfrac12(1 + \sqrt5))^{n-1}$. Prove also that, if
the partial quotients are bounded above by a constant
$A$, then $q_n \le (\tfrac12(A + \sqrt{A^2 + 4}))^n$.

(v) Assuming that the continued fraction for $e$ is as
quoted in § 6, show that $|e - p/q| > c/(q^2 \log q)$ for all
rationals $p/q$ ($q > 1$), where $c$ is a positive constant.

(vi) Assuming the Thue-Siegel-Roth theorem, show that
the sum $a^{-b} + a^{-b^2} + a^{-b^3} + \cdots$ is transcendental for any
integers $a \ge 2$, $b \ge 3$.

(vii) Let $\alpha$, $\beta$, $\gamma$, $\delta$ be real numbers with $\Delta = \alpha\delta - \beta\gamma \ne 0$.
Prove that there exist integers $x$, $y$, not both 0, such
that $|L| + |M| \le (2|\Delta|)^{1/2}$, where $L = \alpha x + \beta y$ and
$M = \gamma x + \delta y$. Deduce that the inequality $|LM| \le \tfrac12|\Delta|$ is soluble
non-trivially.

(viii) With the same notation, prove that the inequality
$L^2 + M^2 \le (4/\pi)|\Delta|$ is soluble in integers $x$, $y$, not both 0.
Verify that the constant $4/\pi$ cannot be replaced by a
number smaller than $(4/3)^{1/2}$.
(ix) Assuming Kronecker's theorem and the transcendence
of $e^{\pi}$, show that, for any primes $p_1, \ldots, p_m$, there exists
an integer $n > 0$ such that $\cos(\log p_j^n) \le -\tfrac12$ for
$j = 1, 2, \ldots, m$.

1 Algebraic number fields

Although we shall be concerned in the sequel only with 
quadratic fields, we shall nevertheless begin with a short dis- 
cussion of the more general concept of an algebraic number 
field. The theory relating to such fields has arisen from attempts 
to solve Fermat’s last theorem and it is one of the most beautiful 
and profound in mathematics. 

Let $\alpha$ be an algebraic number with degree $n$ and let $P$ be the
minimal polynomial for $\alpha$ (see § 5 of Chapter 6). By the conjugates
of $\alpha$ we mean the zeros $\alpha_1, \ldots, \alpha_n$ of $P$. The algebraic
number field $k$ generated by $\alpha$ over the rationals $\mathbb{Q}$ is defined
as the set of numbers $Q(\alpha)$, where $Q(x)$ is any polynomial with
rational coefficients; the set can be regarded as being embedded
in the complex number field $\mathbb{C}$ and thus its elements are subject
to the usual operations of addition and multiplication. To verify
that $k$ is indeed a field we have to show that every non-zero
element $Q(\alpha)$ has an inverse. Now, if $P$ is the minimal polynomial
for $\alpha$ as above, then $P$, $Q$ are relatively prime and so
there exist polynomials $R$, $S$ such that $PS + QR = 1$ identically,
that is for all $x$. On putting $x = \alpha$ this gives $R(\alpha) = 1/Q(\alpha)$, as
required. The field $k$ is said to have degree $n$ over $\mathbb{Q}$, and one
writes $[k : \mathbb{Q}] = n$.
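
The polynomial $R$ with $PS + QR = 1$ can be computed by the extended Euclidean algorithm applied to polynomials with rational coefficients. The sketch below is one way of doing so and is not taken from the text; the function names are hypothetical, and polynomials are lists of coefficients, lowest degree first.

```python
from fractions import Fraction

def trim(a):
    """Drop trailing zero coefficients, keeping at least one entry."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def poly_sub(a, b):
    size = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (size - len(a))
    b = list(b) + [Fraction(0)] * (size - len(b))
    return trim([x - y for x, y in zip(a, b)])

def poly_divmod(a, b):
    """Quotient and remainder of a by b over the rationals."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    quot = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and a != [0]:
        shift = len(a) - len(b)
        coef = a[-1] / b[-1]
        quot[shift] = coef
        for i, c in enumerate(b):
            a[i + shift] -= coef * c
        a = trim(a[:-1]) if len(a) > 1 else [Fraction(0)]
    return trim(quot), a

def inverse_mod(q, p):
    """Return R with Q*R congruent to 1 modulo P, so that R(alpha) = 1/Q(alpha)."""
    r0, r1 = [Fraction(c) for c in trim(p)], poly_divmod(q, p)[1]
    t0, t1 = [Fraction(0)], [Fraction(1)]  # invariant: t_i * Q = r_i (mod P)
    while r1 != [0]:
        quo, rem = poly_divmod(r0, r1)
        r0, r1 = r1, rem
        t0, t1 = t1, poly_sub(t0, poly_mul(quo, t1))
    if len(r0) != 1:
        raise ValueError("P and Q are not relatively prime")
    return poly_divmod([c / r0[0] for c in t0], p)[1]

if __name__ == "__main__":
    # P = x^3 - 2, Q = x + 1: the inverse of 1 + alpha, where alpha^3 = 2.
    print(inverse_mod([1, 1], [-2, 0, 0, 1]))
```

For $\alpha^3 = 2$ this reproduces the identity $(1 + \alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.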

The construction can be continued analogously to furnish, for
every algebraic number field $k$ and every algebraic number $\beta$,
a field $K = k(\beta)$ with elements given by polynomials in $\beta$ with
coefficients in $k$. The degree $[K : k]$ of $K$ over $k$ is defined in the
obvious way as the degree of $\beta$ over $k$. Now $K$ is in fact an
algebraic number field over $\mathbb{Q}$, for it can be shown that $K = \mathbb{Q}(\gamma)$,
where $\gamma = u\alpha + v\beta$ for some rationals $u$, $v$; thus we have

$$[K : k][k : \mathbb{Q}] = [K : \mathbb{Q}].$$

An algebraic number is said to be an algebraic integer if the
coefficient of the highest power of $x$ in the minimal polynomial
$P$ is 1. The algebraic integers in an algebraic number field $k$
form a ring $R$. The ring has an integral basis, that is, there exist
elements $\omega_1, \ldots, \omega_n$ in $R$ such that every element in $R$ can be
expressed uniquely in the form $u_1\omega_1 + \cdots + u_n\omega_n$ for some
rational integers $u_1, \ldots, u_n$. We can write $\omega_i = p_i(\alpha)$, where $p_i$
denotes a polynomial over $\mathbb{Q}$, and it is then readily verified that
the number $(\det p_i(\alpha_j))^2$ is a rational integer independent of the
choice of basis; it is called the discriminant of $k$ and it turns
out to be an important invariant.

An algebraic integer $\alpha$ is said to be divisible by an algebraic
integer $\beta$ if $\alpha/\beta$ is an algebraic integer. An algebraic integer $\varepsilon$
is said to be a unit if $1/\varepsilon$ is an algebraic integer. Suppose now
that $R$ is the ring of algebraic integers in a number field $k$. Two
elements $\alpha$, $\beta$ of $R$ are said to be associates if $\alpha = \varepsilon\beta$ for some
unit $\varepsilon$, and this is an equivalence relation on $R$. An element $\alpha$
of $R$ is said to be irreducible if every divisor of $\alpha$ in $R$ is either
an associate or a unit. One calls $R$ a unique factorization domain
if every element of $R$ can be expressed essentially uniquely as
a product of irreducible elements. The fundamental theorem of
arithmetic asserts that the ring of integers in $k = \mathbb{Q}$ has this
property; but it does not hold for every $k$. Nevertheless, it is
known from pioneering studies of Kummer and Dedekind that
a unique factorization property can be restored by the introduction
of ideals, and this forms the central theme of algebraic
number theory. The work on Fermat's last theorem that motivated
much of the subject related to the particular case of the
cyclotomic field $\mathbb{Q}(\zeta)$ where $\zeta$ is a root of unity.

2 The quadratic field 

Let $d$ be a square-free integer, positive or negative, but
not 1. The quadratic field $\mathbb{Q}(\sqrt{d})$ is the set of all numbers of the
form $u + v\sqrt{d}$ with rational $u$, $v$, subject to the usual operations
of addition and multiplication. For any element $\alpha = u + v\sqrt{d}$ in
$\mathbb{Q}(\sqrt{d})$ one defines the norm of $\alpha$ as the rational number
$N(\alpha) = u^2 - dv^2$. Clearly $N(\alpha) = \alpha\bar\alpha$, where $\bar\alpha = u - v\sqrt{d}$; $\bar\alpha$ is called the
conjugate of $\alpha$. Now for any elements $\alpha$, $\beta$ in $\mathbb{Q}(\sqrt{d})$ we see that
$\overline{\alpha\beta} = \bar\alpha\bar\beta$ and thus we have the important formula
$N(\alpha)N(\beta) = N(\alpha\beta)$. It is readily verified that $\mathbb{Q}(\sqrt{d})$ is indeed a field; in
particular, the inverse of any non-zero element $\alpha$ is $\bar\alpha/N(\alpha)$.
The special field $\mathbb{Q}(\sqrt{-1})$ is called the Gaussian field and it is
customary to express its elements in the form $u + iv$; in this case
we have $N(\alpha) = u^2 + v^2$ and so the product formula is precisely
the identity referred to in § 4 of Chapter 5.
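
The arithmetic of $\mathbb{Q}(\sqrt{d})$ is simple enough to model directly. In the sketch below, which is an illustration with hypothetical function names, an element $u + v\sqrt{d}$ is represented by the pair `(u, v)` of rationals; the example checks the product formula for the norm and the expression $\bar\alpha/N(\alpha)$ for the inverse.

```python
from fractions import Fraction

def conj(a):
    """Conjugate u - v*sqrt(d) of a = (u, v)."""
    u, v = a
    return (u, -v)

def norm(a, d):
    """N(a) = u^2 - d*v^2."""
    u, v = a
    return u * u - d * v * v

def mul(a, b, d):
    """(u + v*sqrt(d)) * (x + y*sqrt(d))."""
    (u, v), (x, y) = a, b
    return (u * x + d * v * y, u * y + v * x)

def inverse(a, d):
    """conj(a) / N(a); the norm is non-zero for a != 0 because d is square-free."""
    n = norm(a, d)
    if n == 0:
        raise ZeroDivisionError("element is zero")
    u, v = conj(a)
    return (u / n, v / n)

if __name__ == "__main__":
    d = -5
    a = (Fraction(1), Fraction(2))
    b = (Fraction(3, 2), Fraction(-1))
    print(norm(mul(a, b, d), d) == norm(a, d) * norm(b, d))
    print(mul(a, inverse(a, d), d))
```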

We proceed now to determine the algebraic integers in $\mathbb{Q}(\sqrt{d})$.
Suppose that $\alpha = u + v\sqrt{d}$ is such an integer and let $a = 2u$, $b = 2v$.
Then $\alpha$ is a zero of the polynomial $P(x) = x^2 - ax + c$, where
$c = N(\alpha)$, and so the rational numbers $a$, $c$ must in fact be
integers. We have $4c = a^2 - b^2 d$ and, since $d$ is square-free, it
follows that also $b$ is a rational integer. Now if $d \equiv 2$ or 3 (mod 4)
then, since a square is congruent to 0 or 1 (mod 4), we see that
$a$, $b$ are even and thus $u$, $v$ are rational integers; hence, in this
case, an integral basis for $\mathbb{Q}(\sqrt{d})$ is given by $1, \sqrt{d}$. If $d \equiv 1$ (mod 4),
which is the only other possibility, then $a \equiv b$ (mod 2)
and thus $u - v$ is a rational integer; recalling that $v = \tfrac12 b$, we
conclude that, in this case, an integral basis for $\mathbb{Q}(\sqrt{d})$ is given
by $1, \tfrac12(1 + \sqrt{d})$. The discriminant $D$ of $\mathbb{Q}(\sqrt{d})$, as defined in § 1,
is therefore $4d$ when $d \equiv 2$ or 3 (mod 4) and it is $d$ when
$d \equiv 1$ (mod 4).
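
The criterion just derived, that $u + v\sqrt{d}$ is an algebraic integer precisely when $2u$ and $u^2 - dv^2$ are rational integers, is easily put to the test. The sketch below uses hypothetical function names and exact rational arithmetic.

```python
from fractions import Fraction

def is_algebraic_integer(u, v, d):
    """True when 2u (the trace) and u^2 - d*v^2 (the norm) are both rational integers."""
    u, v = Fraction(u), Fraction(v)
    trace = 2 * u
    nrm = u * u - d * v * v
    return trace.denominator == 1 and nrm.denominator == 1

def discriminant(d):
    """Discriminant of Q(sqrt(d)) for square-free d: d if d = 1 (mod 4), else 4d."""
    return d if d % 4 == 1 else 4 * d

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(is_algebraic_integer(half, half, 5))   # (1 + sqrt 5)/2
    print(is_algebraic_integer(half, half, 3))   # (1 + sqrt 3)/2
    print([discriminant(d) for d in (-3, -1, 2, 3, 5)])
```

Python's `%` operator returns a non-negative residue for a positive modulus, so the test `d % 4 == 1` is valid for negative $d$ as well.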

There is a close analogy between the theory of quadratic fields
and the theory of binary quadratic forms as described in Chapter
5. In particular, the discriminant $D$ of $\mathbb{Q}(\sqrt{d})$ is congruent to 0
or 1 (mod 4) and so $D$ is also the discriminant of a binary
quadratic form. Now if $\alpha$ is any algebraic integer in $\mathbb{Q}(\sqrt{d})$ then,
for some rational integers $x$, $y$, we have $\alpha = x + y\sqrt{d}$ when $d \equiv 2$
or 3 (mod 4) and $\alpha = x + \tfrac12 y(1 + \sqrt{d})$ when $d \equiv 1$ (mod 4). Thus we
see that $N(\alpha) = F(x, y)$, where $F$ denotes the principal form with
discriminant $D$, that is $x^2 - dy^2$ when $D \equiv 0$ (mod 4) and
$(x + \tfrac12 y)^2 - \tfrac14 dy^2$ when $D \equiv 1$ (mod 4).

3 Units 

By a unit in $\mathbb{Q}(\sqrt{d})$ we mean an algebraic integer $\varepsilon$ in
$\mathbb{Q}(\sqrt{d})$ such that $1/\varepsilon$ is an algebraic integer. Plainly if $\varepsilon$ is a unit
then $N(\varepsilon)$ and $N(1/\varepsilon)$ are rational integers and, since
$N(\varepsilon)N(1/\varepsilon) = 1$, we see that $N(\varepsilon) = \pm1$. Conversely, if $N(\varepsilon) = \pm1$,
then $\varepsilon\bar\varepsilon = \pm1$ and so $\varepsilon$ is a unit. Thus, by the above remarks, the
units in $\mathbb{Q}(\sqrt{d})$ are determined by the integer solutions $x$, $y$ of
the equation $F(x, y) = \pm1$.

We shall distinguish two cases according as $d < 0$ or $d > 0$; in
the first case the quadratic field is said to be imaginary and in
the second it is said to be real. Now in an imaginary quadratic
field there are only finitely many units. In fact if $d < -3$ then,
as is readily verified, the equation $F(x, y) = \pm1$ has only the
solutions $x = \pm1$, $y = 0$ and so the only units in $\mathbb{Q}(\sqrt{d})$ are $\pm1$;
the same holds for $d = -2$, where $F(x, y) = x^2 + 2y^2$.
For $d = -1$, that is, for the Gaussian field, we have
$F(x, y) = x^2 + y^2$ and there are therefore four units $\pm1$, $\pm i$. For $d = -3$ we
have $F(x, y) = x^2 + xy + y^2$ and the equation $F(x, y) = \pm1$ has six
solutions, namely $(\pm1, 0)$, $(0, \pm1)$, $(1, -1)$ and $(-1, 1)$; thus the
units of $\mathbb{Q}(\sqrt{-3})$ are $\pm1$, $\tfrac12(\pm1 \pm \sqrt{-3})$. It follows that the units
in an imaginary quadratic field are all roots of unity; they are
given by the zeros of $x^2 - 1$ when $D < -4$, by those of $x^4 - 1$
when $D = -4$ and by those of $x^6 - 1$ when $D = -3$. Hence the
number of units is the same as the number $w$ for forms with
discriminant $D$ indicated in § 3 of Chapter 5.
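
The count of units in an imaginary quadratic field can be confirmed by enumerating the solutions of $F(x, y) = 1$; since $F$ is positive definite for $d < 0$, the value $-1$ is not taken. The sketch below uses hypothetical function names and a small search box, which suffices because every solution has $|x| \le 1$ and $|y| \le 1$.

```python
def principal_form(d):
    """Principal form of discriminant D for square-free d."""
    if d % 4 == 1:
        c = (1 - d) // 4
        return lambda x, y: x * x + x * y + c * y * y
    return lambda x, y: x * x - d * y * y

def units_imaginary(d, box=2):
    """All integer solutions (x, y) of F(x, y) = 1 for d < 0, within |x|, |y| <= box."""
    F = principal_form(d)
    return [(x, y)
            for x in range(-box, box + 1)
            for y in range(-box, box + 1)
            if F(x, y) == 1]

if __name__ == "__main__":
    for d in (-1, -2, -3, -5, -7, -163):
        print(d, len(units_imaginary(d)))
```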

We turn now to real quadratic fields; in this case there are
infinitely many units. To establish the result it suffices to show
that, when $d > 0$, there is a unit $\eta$ in $\mathbb{Q}(\sqrt{d})$ other than $\pm1$; for
then $\eta^m$ is a unit for all integers $m$, and, since the only roots of
unity in $\mathbb{Q}(\sqrt{d})$ are $\pm1$, we see that different $m$ give distinct units.
We shall use Dirichlet's theorem on Diophantine approximation
(see § 1 of Chapter 6); the theorem implies that, for any integer
$Q > 1$, there exist rational integers $p$, $q$, with $0 < q < Q$, such that
$|\alpha| \le 1/Q$, where $\alpha = p - q\sqrt{d}$. Now the conjugate $\bar\alpha = \alpha + 2q\sqrt{d}$
satisfies $|\bar\alpha| \le 3Q\sqrt{d}$ and thus we have $|N(\alpha)| \le 3\sqrt{d}$. Further,
since $\sqrt{d}$ is irrational, we obtain, as $Q \to \infty$, infinitely many $\alpha$
with this property. But $N(\alpha)$ is a rational integer bounded
independently of $Q$, and thus, for infinitely many $\alpha$, it takes
some fixed value, say $N$. Moreover we can select two distinct
elements from the infinite set, say $\alpha = p - q\sqrt{d}$ and $\alpha' = p' - q'\sqrt{d}$,
such that $p \equiv p'$ (mod $N$) and $q \equiv q'$ (mod $N$). We now put
$\eta = \alpha/\alpha'$. Then $N(\eta) = N(\alpha)/N(\alpha') = 1$. Further, $\eta$ is clearly not 1,
and it is also not $-1$ since $\sqrt{d}$ is irrational and $q$, $q'$ are positive.
Furthermore we have $\eta = x + y\sqrt{d}$, where $x = (pp' - dqq')/N$ and
$y = (pq' - p'q)/N$, and the congruences above imply that $x$, $y$ are
rational integers. Thus $\eta$ is a non-trivial unit in $\mathbb{Q}(\sqrt{d})$, as
required. The argument here shows, incidentally, that the Pell
equation $x^2 - dy^2 = 1$ has a non-trivial solution; we shall discuss
the equation more fully in Chapter 8.
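
The box-principle argument is constructive enough to be carried out by machine. The sketch below follows the proof with one simplification, stated here as an assumption: instead of invoking Dirichlet's theorem for each $Q$, it takes for every $q$ the integer $p$ nearest to $q\sqrt{d}$ and retains the pair only when $|N(\alpha)| \le 3\sqrt{d}$. The function names and the limit `max_q` are hypothetical, and all arithmetic is exact.

```python
from math import isqrt

def nearest_p(d, q):
    """Integer nearest to q*sqrt(d), found without floating point."""
    p = isqrt(d * q * q)
    # q*sqrt(d) is nearer to p + 1 exactly when (2p + 1)^2 < 4*d*q^2.
    if (2 * p + 1) ** 2 < 4 * d * q * q:
        p += 1
    return p

def unit_by_pigeonhole(d, max_q=10_000):
    """Return (x, y), y != 0, with x^2 - d*y^2 = 1, or None if max_q is too small."""
    seen = {}
    for q in range(1, max_q + 1):
        p = nearest_p(d, q)
        n = p * p - d * q * q          # N(alpha) for alpha = p - q*sqrt(d)
        if n * n > 9 * d:              # keep only |N(alpha)| <= 3*sqrt(d)
            continue
        key = (n, p % abs(n), q % abs(n))
        if key in seen:
            p2, q2 = seen[key]
            # eta = alpha / alpha' = x + y*sqrt(d); the congruences make x, y integral.
            x, rx = divmod(p * p2 - d * q * q2, n)
            y, ry = divmod(p * q2 - p2 * q, n)
            assert rx == 0 and ry == 0
            return x, y
        seen[key] = (p, q)
    return None

if __name__ == "__main__":
    for d in (2, 3, 5, 7, 13):
        print(d, unit_by_pigeonhole(d))
```

The unit produced in this way need not be the smallest one; the argument guarantees only that it is non-trivial.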

We can now give a simple expression for all the units in a real
quadratic field. In fact consider the set of all units in the field
that exceed 1. The set is not empty, for if $\eta$ is the unit obtained
above then one of the numbers $\pm\eta$ or $\pm1/\eta$ is a member. Further,
each element of the set has the form $u + v\sqrt{d}$, where $u$, $v$ are
either integers, or, in the case $d \equiv 1$ (mod 4), possibly halves of
odd integers. Furthermore $u$ and $v$ are positive, for $u + v\sqrt{d}$ is
greater than 1, whereas its conjugate $u - v\sqrt{d}$ lies between $-1$ and 1.
It follows that there is a smallest element in the set, say $\varepsilon$. Now
if $\varepsilon'$ is any positive unit in the field then there is a unique integer
$m$ such that $\varepsilon^m \le \varepsilon' < \varepsilon^{m+1}$; this gives $1 \le \varepsilon'/\varepsilon^m < \varepsilon$. But $\varepsilon'/\varepsilon^m$
is also a unit in the field and thus, from the definition of $\varepsilon$, we
conclude that $\varepsilon' = \varepsilon^m$. This shows that all the units in the field
are given by $\pm\varepsilon^m$, where $m = 0, \pm1, \pm2, \ldots$.
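
The smallest unit exceeding 1 can be located by a direct search. The sketch below is illustrative and uses a hypothetical function name; it writes $\varepsilon = \tfrac12(a + b\sqrt{d})$ and seeks the least $b \ge 1$ for which $a^2 - db^2 = \pm4$ has a solution subject to the parity conditions of § 2. It relies on the observation, assumed here rather than taken from the text, that among units exceeding 1 the coefficient $b$ does not decrease as the unit grows, the only tie being $d = 5$, $b = 1$, which is resolved by trying the smaller value of $a$ first.

```python
from math import isqrt

def fundamental_unit(d):
    """Return (a, b) such that (a + b*sqrt(d))/2 is the smallest unit > 1 in Q(sqrt(d)).

    d is assumed to be a square-free integer greater than 1.
    """
    b = 1
    while True:
        for sign in (-4, 4):           # norm -1 first: it gives the smaller a
            t = d * b * b + sign
            if t <= 0:
                continue
            a = isqrt(t)
            if a * a != t:
                continue
            # For d = 2 or 3 (mod 4) both a/2 and b/2 must be integers.
            if d % 4 == 1 or (a % 2 == 0 and b % 2 == 0):
                return a, b
        b += 1

if __name__ == "__main__":
    for d in (2, 3, 5, 6, 7, 13):
        a, b = fundamental_unit(d)
        print(d, f"({a} + {b}*sqrt({d}))/2")
```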

The results established here for quadratic fields are special
cases of a famous theorem of Dirichlet concerning units in an
arbitrary algebraic number field. Suppose that the field $k$ is
generated by an algebraic number $\alpha$ with degree $n$ and that
precisely $s$ of the conjugates $\alpha_1, \ldots, \alpha_n$ of $\alpha$ are real; then
$n = s + 2t$, where $t$ is the number of complex conjugate pairs.
Dirichlet's theorem asserts that there exist $r = s + t - 1$ fundamental
units $\varepsilon_1, \ldots, \varepsilon_r$ in $k$ such that every unit in $k$ can be
expressed uniquely in the form $\rho\,\varepsilon_1^{m_1} \cdots \varepsilon_r^{m_r}$, where $m_1, \ldots, m_r$
are rational integers and $\rho$ is a root of unity in $k$.

4 Primes and factorization 

Let $R$ be the ring of algebraic integers in a quadratic
field $\mathbb{Q}(\sqrt{d})$. By a prime $\pi$ in $R$ we mean an element of $R$ that
is neither 0 nor a unit and which has the property that if $\pi$
divides $\alpha\beta$, where $\alpha$, $\beta$ are elements of $R$, then either $\pi$ divides
$\alpha$ or $\pi$ divides $\beta$. It will be noted at once that a prime $\pi$ is
irreducible in the sense indicated in § 1; for if $\pi = \alpha\beta$ then either
$\alpha/\pi$ or $\beta/\pi$ is an element of $R$ whence either $\beta$ or $\alpha$ is a unit.
However, an irreducible element need not be a prime. Consider,
for example, the number 2 in the quadratic field $\mathbb{Q}(\sqrt{-5})$. It is
certainly irreducible, for if $2 = \alpha\beta$ then $4 = N(\alpha)N(\beta)$; but $N(\alpha)$
and $N(\beta)$ have the form $x^2 + 5y^2$ for some integers $x$, $y$, and,
since the equation $x^2 + 5y^2 = \pm2$ has no integer solutions, it follows
that either $N(\alpha) = \pm1$ or $N(\beta) = \pm1$ and thus either $\alpha$ or $\beta$
is a unit. On the other hand, 2 is not a prime in $\mathbb{Q}(\sqrt{-5})$, for it
divides

$$(1 + \sqrt{-5})(1 - \sqrt{-5}) = 6,$$

but it does not divide either $1 + \sqrt{-5}$ or $1 - \sqrt{-5}$; indeed, by
taking norms, it is readily verified that each of the latter is
irreducible.
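
The facts used in this example are finite checks and can be verified mechanically. In the sketch below, which uses hypothetical function names, an element $x + y\sqrt{-5}$ of $R$ is held as the integer pair `(x, y)`; since $-5 \equiv 3$ (mod 4), these pairs exhaust $R$, and 2 divides such an element precisely when $x$ and $y$ are both even.

```python
def mul(a, b):
    """(x1 + y1*sqrt(-5)) * (x2 + y2*sqrt(-5))."""
    (x1, y1), (x2, y2) = a, b
    return (x1 * x2 - 5 * y1 * y2, x1 * y2 + y1 * x2)

def norm(a):
    """N(x + y*sqrt(-5)) = x^2 + 5*y^2."""
    x, y = a
    return x * x + 5 * y * y

def elements_of_norm(target, bound=3):
    """All (x, y) with x^2 + 5*y^2 = target; bound = 3 suffices for target <= 9."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if norm((x, y)) == target]

def two_divides(a):
    """2 divides x + y*sqrt(-5) in R exactly when x and y are both even."""
    x, y = a
    return x % 2 == 0 and y % 2 == 0

if __name__ == "__main__":
    print(elements_of_norm(2), elements_of_norm(3))   # no elements of norm 2 or 3
    product_ = mul((1, 1), (1, -1))
    print(product_, two_divides(product_), two_divides((1, 1)), two_divides((1, -1)))
```

The absence of elements of norm 2 or 3 shows at the same time that $1 \pm \sqrt{-5}$, which have norm 6, are irreducible.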

Now every element $\alpha$ of $R$ that is neither 0 nor a unit can be
factorized into a finite product of irreducible elements. For if $\alpha$
is not itself irreducible then $\alpha = \beta\gamma$ for some $\beta$, $\gamma$ in $R$, neither
of which is a unit. If $\beta$ were not irreducible then it could be
factorized likewise, and the same holds for $\gamma$. The process must
terminate, for if $\alpha = \beta_1 \cdots \beta_n$, where none of the $\beta$s is a unit,
then, since $|N(\beta_j)| \ge 2$, we see that $|N(\alpha)| \ge 2^n$. The ring $R$ is said
to be a unique factorization domain if the expression for $\alpha$ as a
finite product of irreducible elements is essentially unique, that
is, unique except for the order of the factors and the possible
replacement of irreducible elements by their associates. A fundamental
problem in number theory is to determine which domains
have unique factorization, and here the definition of a prime
plays a crucial role. In fact we have the basic theorem that $R$ is
a unique factorization domain if and only if every irreducible
element of $R$ is also a prime in $R$. To verify the assertion, note
that if factorization in $R$ is unique and if $\pi$ is an irreducible
element such that $\pi$ divides $\alpha\beta$ for some $\alpha$, $\beta$ in $R$ then $\pi$ must
be an associate of one of the irreducible factors of $\alpha$ or $\beta$ and
so $\pi$ divides $\alpha$ or $\beta$, as required. Conversely, if every irreducible
element is also a prime then we can argue as in the demonstration
of the fundamental theorem of arithmetic given in § 5 of Chapter
1; thus if $\alpha = \pi_1 \cdots \pi_k$ as a product of irreducible elements,
and if $\pi'$ is an irreducible element occurring in another factorization,
then $\pi'$ must divide $\pi_j$ for some $j$, whence $\pi'$ and $\pi_j$ are
associates, and assuming by induction that the result holds for
$\alpha/\pi'$, the required uniqueness of factorization follows.

All the imaginary quadratic fields $\mathbb{Q}(\sqrt{d})$ which have the
unique factorization property are known; they are given by
$d = -1, -2, -3, -7, -11, -19, -43, -67$ and $-163$. The theorem
has a long history, dating back to Gauss, and it was finally proved
by Baker and Stark, independently, in 1966; the methods of
proof were quite different, one depending on transcendence
theory (cf. § 6 of Chapter 6) and the other on the study of elliptic
modular functions. The theorem shows, incidentally, that the
nine discriminants $d$ indicated in Exercise (i) of § 7 of Chapter
5 are the only values for which $h(d) = 1$. The problem of finding
all the real quadratic fields $\mathbb{Q}(\sqrt{d})$ with unique factorization
remains open; it is generally conjectured that there are infinitely
many such fields but even this has not been proved. Nevertheless
all such fields with $d$ relatively small, for instance with $d < 100$,
are known; we shall discuss some particular cases in the next 
section. 

5 Euclidean fields 

A quadratic field $\mathbb{Q}(\sqrt{d})$ is said to be Euclidean if its
ring of integers $R$ has the property that, for any elements $\alpha$, $\beta$
of $R$ with $\beta \ne 0$, there exist elements $\gamma$, $\delta$ of $R$ such that
$\alpha = \beta\gamma + \delta$ and $|N(\delta)| < |N(\beta)|$. For such fields there exists a
Euclidean algorithm analogous to that described in Chapter 1. In fact we
can generate the sequence of equations
$\delta_{j-2} = \delta_{j-1}\gamma_j + \delta_j$ ($j = 1, 2, \ldots$), where $\delta_{-1} = \alpha$, $\delta_0 = \beta$,
$\delta_1 = \delta$, $\gamma_1 = \gamma$ and $|N(\delta_j)| < |N(\delta_{j-1})|$; the sequence
terminates when $\delta_{k+1} = 0$ for some $k$ and
then $\delta_k$ has the properties of a greatest common divisor, that is,
$\delta_k$ divides $\alpha$ and $\beta$, and every common divisor of $\alpha$, $\beta$ divides
$\delta_k$. Moreover we have $\delta_k = \alpha\lambda + \beta\mu$ for some $\lambda$, $\mu$ in $R$. This can
be verified either by successive substitution or by observing that
$|N(\delta_k)|$ is the least member of the set of positive integers of the
form $|N(\alpha\lambda + \beta\mu)|$, where $\lambda$, $\mu$ run through the elements of $R$.
In fact the set certainly has a least member $|N(\delta')|$, say, where
$\delta' = \alpha\lambda + \beta\mu$ for some $\lambda$, $\mu$ in $R$; thus every common divisor of
$\alpha$, $\beta$ divides $\delta'$. Further, $\delta'$ divides $\alpha$, since from $\alpha = \delta'\gamma + \delta''$,
with $|N(\delta'')| < |N(\delta')|$, we see that $\delta'' = \alpha\lambda' + \beta\mu'$ for some $\lambda'$, $\mu'$
in $R$, whence $N(\delta'') = 0$ and so $\delta'' = 0$; similarly $\delta'$ divides $\beta$.
Hence we have $\delta' = \delta_k$. It is clear that if $\delta_k$ is a unit then, by
division, we obtain elements $\lambda$, $\mu$ in $R$ with $\alpha\lambda + \beta\mu = 1$.

We proceed now to prove that a Euclidean field has unique
factorization. It suffices, in view of § 4, to show that every
irreducible element $\pi$ in $R$ is a prime; accordingly suppose that
$\pi$ divides $\alpha\beta$ but that $\pi$ does not divide $\alpha$. Then, by the remarks
above, there exist integers $\lambda$, $\mu$ in $R$ such that $\alpha\lambda + \pi\mu = 1$. This
gives $\alpha\beta\lambda + \pi\beta\mu = \beta$, whence $\pi$ divides $\beta$. Thus $\pi$ is a prime
and the desired result follows.

It was proved by Chatland and Davenport in 1950, and 
independently by Inkeri at about the same time, that there are 
precisely 21 Euclidean fields $\mathbb{Q}(\sqrt{d})$; the values of $d$ are given
by $-11, -7, -3, -2, -1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57$
and 73. It had been proved earlier by Heilbronn that
the list must be finite, and it had been verified as a consequence 
of works by Dickson, Perron, Oppenheim, Remak and Rédei
that the fields listed here are indeed Euclidean; we shall confirm 
the assertion for the first eight fields in a moment. It is easy to 
see that there can be no other Euclidean fields with $d < 0$. In
fact if $d \equiv 2$ or 3 (mod 4) and $d \le -5$ then we cannot have
$\sqrt{d} = 2\gamma + \delta$ with $|N(\delta)| < 4$; for we can express $\gamma$, $\delta$ as
$x + y\sqrt{d}$ and $x' + y'\sqrt{d}$ respectively, where $x$, $y$ and $x'$, $y'$ are
rational integers, and since $N(\delta) \ge x'^2 + 5y'^2$ we would obtain
$y' = 0$, contrary to $2y + y' = 1$. Similarly if $d \equiv 1$ (mod 4) and
$d \le -15$, then we cannot have $\frac{1}{2}(1 + \sqrt{d}) = 2\gamma + \delta$ with
$|N(\delta)| < 4$. The most difficult
part of the theorem is the proof that there are no other Euclidean
fields with $d > 0$. In this connection, Davenport showed by an
ingenious algorithm derived from studies on Diophantine
approximation that if $d > 2^{14}$ then $\mathbb{Q}(\sqrt{d})$ is not Euclidean; this
reduced the problem to a finite checking of cases. Incidentally,
Rédei claimed originally that the field $\mathbb{Q}(\sqrt{97})$ was Euclidean
but Barnes and Swinnerton-Dyer proved this to be erroneous.

We shall show now that if $d = -2$, $-1$, 2 or 3 then $\mathbb{Q}(\sqrt{d})$ is
Euclidean. Accordingly let $\alpha$, $\beta$ be any algebraic integers in
$\mathbb{Q}(\sqrt{d})$ with $\beta \ne 0$. Then $\alpha/\beta = u + v\sqrt{d}$ for some rationals $u$, $v$.
We select integers $x$, $y$ as close as possible to $u$, $v$ and put $r = u - x$,
$s = v - y$; then $|r| \le \frac{1}{2}$ and $|s| \le \frac{1}{2}$. On writing $\gamma = x + y\sqrt{d}$ we obtain
$\alpha = \beta\gamma + \delta$, where $\delta = \beta(r + s\sqrt{d})$. This gives
$N(\delta) = N(\beta)(r^2 - ds^2)$. But for $|d| \le 2$ we have
$|r^2 - ds^2| \le r^2 + 2s^2 \le \frac{3}{4}$, and
for $d = 3$ we have $|r^2 - ds^2| \le \max(r^2, ds^2) \le \frac{3}{4}$. Hence
$|N(\delta)| < |N(\beta)|$, as required.
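
The rounding argument translates directly into a division procedure. The sketch below, in Python, is offered as an illustration under stated assumptions: elements $a + b\sqrt{d}$ are stored as integer pairs `(a, b)`, and the names `nearest_int`, `norm`, `divmod_sqrt` and `gcd_sqrt` are chosen here for the example. It is meant only for $d = -2$, $-1$, 2 or 3, where the integers of the field have this form and the bound above applies.

```python
from fractions import Fraction

# Illustrative sketch (assumed representation): a + b*sqrt(d) is the pair (a, b).

def nearest_int(q):
    """An integer n with |q - n| <= 1/2, for a Fraction q."""
    return (2 * q.numerator + q.denominator) // (2 * q.denominator)

def norm(a, d):
    """N(x + y*sqrt(d)) = x^2 - d*y^2."""
    x, y = a
    return x * x - d * y * y

def divmod_sqrt(alpha, beta, d):
    """Return (gamma, delta) with alpha = beta*gamma + delta and beta nonzero."""
    (a1, a2), (b1, b2) = alpha, beta
    n = norm(beta, d)
    # alpha/beta = alpha * conjugate(beta) / N(beta) = u + v*sqrt(d)
    u = Fraction(a1 * b1 - d * a2 * b2, n)
    v = Fraction(a2 * b1 - a1 * b2, n)
    x, y = nearest_int(u), nearest_int(v)
    # delta = alpha - beta*gamma, with gamma = x + y*sqrt(d)
    delta = (a1 - (b1 * x + d * b2 * y), a2 - (b1 * y + b2 * x))
    return (x, y), delta

def gcd_sqrt(alpha, beta, d):
    """Euclidean algorithm; the result is a greatest common divisor up to units."""
    while beta != (0, 0):
        _, remainder = divmod_sqrt(alpha, beta, d)
        alpha, beta = beta, remainder
    return alpha

g = gcd_sqrt((11, 3), (1, 8), -1)    # gcd of 11 + 3i and 1 + 8i
print(g, norm(g, -1))
```

The loop terminates because the absolute value of the norm of the remainder strictly decreases at each step, exactly as in the sequence of equations described above.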

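Finally we prove that $\mathbb{Q}(\sqrt{d})$ is Euclidean when $d = -11$, $-7$,
$-3$, 5 and 13. In these cases we have $d \equiv 1$ (mod 4) and so 1,
$\frac{1}{2}(1 + \sqrt{d})$ is an integral basis for $\mathbb{Q}(\sqrt{d})$. Again let $\alpha$, $\beta$ be any
algebraic integers in $\mathbb{Q}(\sqrt{d})$, with $\beta \ne 0$, and let $\alpha/\beta = u + v\sqrt{d}$
with $u$, $v$ rational. We select an integer $y$ as close as possible to
$2v$ and put $s = v - \frac{1}{2}y$; then $|s| \le \frac{1}{4}$. Further we select an integer $x$
as close as possible to $u - \frac{1}{2}y$ and put $r = u - x - \frac{1}{2}y$; then $|r| \le \frac{1}{2}$.
On writing $\gamma = x + \frac{1}{2}y(1 + \sqrt{d})$ we see that $\alpha = \beta\gamma + \delta$, where
$\delta = \beta(r + s\sqrt{d})$. Now, for $|d| \le 11$, we have
$|r^2 - ds^2| \le \frac{1}{4} + \frac{11}{16} < 1$, and,
for $d = 13$, we have $|r^2 - ds^2| \le \frac{13}{16}$. The result follows.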

6 The Gaussian field 

To conclude this chapter we shall describe the principal 
properties of the most fundamental quadratic field, namely the 
Gaussian field $\mathbb{Q}(\sqrt{-1})$ or $\mathbb{Q}(i)$. We have already seen that the
integers in the field, that is, the Gaussian integers, have the form
$x + iy$ with $x$, $y$ rational integers. Thus the norm of a Gaussian
integer has the form $x^2 + y^2$, and, in particular, it is non-negative.
It was noted in § 3 that there are just four units $\pm 1$ and $\pm i$.
Moreover we proved in § 5 that the field is Euclidean and so 
has unique factorization. Hence there is no need to distinguish 
between irreducible elements and primes, and we shall use the 
latter terminology in preference; in fact we shall refer to the 
elements as Gaussian primes. 

Our purpose now is to determine all the Gaussian primes. We 
begin with two preliminary observations which actually apply 
analogously to all quadratic fields with unique factorization. 
First, if $\alpha$ is any Gaussian integer and if $N(\alpha)$ is a rational prime
then $\alpha$ is a Gaussian prime; for plainly if $\alpha = \beta\gamma$ for some
Gaussian integers $\beta$, $\gamma$ then $N(\alpha) = N(\beta)N(\gamma)$ and so either
$N(\beta) = 1$ or $N(\gamma) = 1$ whence either $\beta$ or $\gamma$ is a unit. Secondly
we observe that every Gaussian prime $\pi$ divides just one rational
prime $p$. For $\pi$ certainly divides $N(\pi)$ and so there is a least
positive rational integer $p$ such that $\pi$ divides $p$; and $p$ is a
rational prime, for if $p = mn$, where $m$, $n$ are rational integers,
then, since $\pi$ is a Gaussian prime, we have either $\pi$ divides $m$
or $\pi$ divides $n$ whence, by the minimal property of $p$, either $m$
or $n$ is 1. The prime $p$ is unique for if $p'$ is any other rational
prime then there exist rational integers $a$, $a'$ such that $ap + a'p' = 1$;
thus if $\pi$ were to divide both $p$ and $p'$ then it would divide
1 and so be a unit contrary to definition.

We note next that a rational prime $p$ is either itself a Gaussian
prime or it is the product $\pi\pi'$ of two Gaussian primes, where $\pi$,
$\pi'$ are conjugates. Indeed $p$ is divisible by some Gaussian prime
$\pi$ and thus we have $p = \pi\lambda$ for some Gaussian integer $\lambda$; this
gives $N(\pi)N(\lambda) = p^2$ and the two cases correspond to the
possibilities $N(\lambda) = 1$, implying that $\lambda$ is a unit and that $p$ is an
associate of $\pi$, and $N(\lambda) = p$, implying that $N(\pi) = p$. Now the
first case applies when $p \equiv 3$ (mod 4) and the second when
$p \equiv 1$ (mod 4). For $N(\pi)$ has the form $x^2 + y^2$ and a square is
congruent to 0 or 1 (mod 4). Further, if $p \equiv 1$ (mod 4), then $-1$ is a
quadratic residue (mod $p$) whence $p$ divides $x^2 + 1 = (x + i)(x - i)$
for some rational integer $x$; but if $p$ were a Gaussian prime then
it would divide either $x + i$ or $x - i$, contrary to the fact that
neither $x/p + i/p$ nor $x/p - i/p$ is a Gaussian integer. With regard
to the prime 2, we have $2 = (1 + i)(1 - i)$ and here $1 + i$ and $1 - i$
are Gaussian primes and, moreover, associates. Combining our
results, we find therefore that the totality of Gaussian primes are
given by the rational primes $p \equiv 3$ (mod 4), by the factors $\pi$, $\pi'$
in the expression $p = \pi\pi'$ appertaining to primes $p \equiv 1$ (mod 4),
and by $1 + i$, together with all their associates formed by multiplying
with $\pm 1$ and $\pm i$. The argument here furnishes, incidentally,
another proof of the result that every prime $p \equiv 1$ (mod 4) can
be expressed as a sum of two squares (see § 4 of Chapter 5).
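
The classification can be illustrated numerically. The Python sketch below is an illustration only; the function names `is_prime`, `two_squares` and `classify`, and the descriptive labels it returns, are assumptions of this example. For a rational prime $p$ it reports whether $p$ remains a Gaussian prime or factorizes as $\pi\pi'$ with $\pi = a + bi$, the pair $(a, b)$ being found by a direct search for $a^2 + b^2 = p$ rather than by the quadratic residue argument of the text.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small primes used here."""
    if n < 2:
        return False
    return all(n % k for k in range(2, isqrt(n) + 1))

def two_squares(p):
    """Return (a, b) with 0 < a <= b and a^2 + b^2 = p, or None if impossible."""
    for a in range(1, isqrt(p) + 1):
        b = isqrt(p - a * a)
        if b >= a and a * a + b * b == p:
            return (a, b)
    return None

def classify(p):
    """Describe how the rational prime p behaves among the Gaussian integers."""
    if p == 2:
        return ("2 = (1 + i)(1 - i), with associate factors", (1, 1))
    if p % 4 == 3:
        return ("remains a Gaussian prime", None)
    return ("product of two conjugate Gaussian primes", two_squares(p))

for p in [q for q in range(2, 40) if is_prime(q)]:
    print(p, classify(p))
```

For $p \equiv 1$ (mod 4) the pair returned gives the factorization $p = (a + bi)(a - bi)$, and hence the representation of $p$ as a sum of two squares.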

Many of the definitions and theorems discussed earlier for the 
rational field possess natural analogues in the Gaussian field. 
Thus, for example, one can specify greatest common divisors 
and congruences in an obvious way, and there is an analogue 
of Fermat’s theorem to the effect that if $\pi$ is a Gaussian prime
and $\alpha$ is a Gaussian integer, with $(\alpha, \pi) = 1$, then
$\alpha^{N(\pi)-1} \equiv 1$ (mod $\pi$). There is also, for instance, an analogue of the prime
number theorem to the effect that the number of non-associated
Gaussian primes $\pi$ with $N(\pi) \le x$ is asymptotic to $x/\log x$ as
$x \to \infty$.

7 Further reading 

The structure of quadratic fields can be properly appreci- 
ated only in the wider context of algebraic number theory and 
with reference especially to the theory of ideals. The classic text 
in this connection is that of Hecke. It was originally published 
in German in 1923 and it has recently appeared in English 
translation under the title Lectures on the theory of algebraic 
numbers (Graduate texts in Mathematics, Vol. 77, Springer- 
Verlag, Berlin, 1981); it remains one of the best works on the 
subject. There are several newer expositions however. In par- 
ticular the book Algebraic number theory by I. Stewart and D. 
Tall (Chapman and Hall, London, 1979) is relatively elementary 
and easy to read, while the volume with the same title edited 
by J. W. S. Cassels and A. Fröhlich (Academic Press, London,
1967) and that by S. Lang (Addison-Wesley, Reading, Mass.,
1970) include accounts of more advanced topics. Some other
good works are E. Artin’s Theory of algebraic numbers (Göttingen,
1956) and W. Narkiewicz’s Elementary and analytic theory
of algebraic numbers (Polish Acad. Sci., Mon. Mat. 57, Warsaw, 
1974). The book Basic number theory (Springer-Verlag, Berlin, 
1967) by A. Weil covers similar ground but is written on a very 
sophisticated level. 

An account of the solution, mentioned in § 4, to the problem 
of determining all imaginary quadratic fields with unique fac- 
torization can be found in Chapter 5 of Baker’s Transcendental 
number theory (Cambridge U.P., 1979). The work, referred to 
in § 5, of Chatland and Davenport on Euclidean fields appeared 
in the Canadian J. Math. 2 (1950), 289-96; the article is reprinted
in The collected works of Harold Davenport (Academic Press, 
London, 1978), Vol. I, pp. 366-73. For a proof of the result on 
Gaussian primes cited at the end of § 6 see E. Landau’s
Einführung in die elementare und analytische Theorie der
algebraischen Zahlen und der Ideale (Teubner, Leipzig, 1918).

8 Exercises

(i) Show that the units in $\mathbb{Q}(\sqrt{2})$ are given by $\pm(1 + \sqrt{2})^n$,
where $n = 0, \pm 1, \pm 2, \ldots$. Find the units in $\mathbb{Q}(\sqrt{3})$.

(ii) Determine the integers $n$ and $d$ for which
$(1 + n\sqrt{d})/(1 - n\sqrt{d})$ is a unit in $\mathbb{Q}(\sqrt{d})$.

(iii) By considering products of norms, or otherwise, prove
that there are infinitely many irreducible elements in
the integral domain of any quadratic field.

(iv) Explain why the equation $2 \cdot 11 = (5 + \sqrt{3})(5 - \sqrt{3})$ is not
inconsistent with the fact that $\mathbb{Q}(\sqrt{3})$ has unique
factorization.

(v) Prove that the equation $2 \cdot 3 = (\sqrt{-6})(-\sqrt{-6})$ implies
that $\mathbb{Q}(\sqrt{-6})$ does not have unique factorization.

(vi) Show that $1 + \sqrt{-17}$ is irreducible in $\mathbb{Q}(\sqrt{-17})$.
Verify that $\mathbb{Q}(\sqrt{-17})$ does not have unique
factorization.

(vii) Find equations to show that $\mathbb{Q}(\sqrt{d})$ does not have
unique factorization for $d = -10$, $-13$, $-14$ and $-15$.

(viii) By considering congruences mod 5, show that there
are no algebraic integers in $\mathbb{Q}(\sqrt{10})$ with norm $\pm 2$ and
$\pm 3$. Prove that $4 + \sqrt{10}$ is irreducible in $\mathbb{Q}(\sqrt{10})$. Hence
verify that $\mathbb{Q}(\sqrt{10})$ does not have unique factorization.

(ix) Use the fact that $\mathbb{Q}(\sqrt{3})$ is Euclidean to determine
algebraic integers $\alpha$, $\beta$ in $\mathbb{Q}(\sqrt{3})$ such that
$(1 + 2\sqrt{3})\alpha + (5 + 4\sqrt{3})\beta = 1$.

(x) Prove that the primes in $\mathbb{Q}(\sqrt{2})$ are given by the
rational primes $p \equiv \pm 3$ (mod 8), the factors $\pi$, $\pi'$ in the
expression $p = \pi\pi'$ appertaining to primes
$p \equiv \pm 1$ (mod 8), and by $\sqrt{2}$, together with all their
associates.

(xi) Show that if $\pi$ is a Gaussian prime then the numbers
$1, 2, \ldots, N(\pi)$ form a complete set of residues
(mod $\pi$); that is, show that none of the differences is
divisible by $\pi$, but that for any Gaussian integer $\alpha$
there is a rational integer $a$ with $1 \le a \le N(\pi)$, such
that $\pi$ divides $\alpha - a$. Apply this result to establish the
analogue of Fermat’s theorem quoted at the end of
§ 6.

Diophantine equations


1 The Pell equation 

Diophantine analysis has its genesis in the fertile mind 
of Fermat. He had studied Bachet’s edition, published in 1621,
of the first six books that then remained of the famous
Arithmetica; this was a treatise, originally consisting of thirteen books,
written by the Greek mathematician Diophantus of Alexandria
at about the third century AD. The Arithmetica was concerned
only with the determination of particular rational or integer 
solutions of algebraic equations, but it inspired Fermat to initiate 
researches into the nature of all such solutions, and herewith 
the modern theories began. 

An especially notorious Diophantine equation, in fact the issue 
of a celebrated challenge from Fermat to the English 
mathematicians of his time, is the equation 

$$x^2 - dy^2 = 1,$$

where $d$ is a positive integer other than a perfect square. It is
usually referred to as the Pell equation but the nomenclature,
due to Euler, has no historical justification since Pell apparently
made no contribution to the topic. Fermat conjectured that there
is at least one non-trivial solution in integers $x$, $y$, that is, a
solution other than $x = \pm 1$, $y = 0$; the conjecture was proved by
Lagrange in 1768. In fact we have already established the result
in § 3 of Chapter 7; it was assumed there that $d$ is square-free
but the argument plainly holds for any $d$ that is not a perfect
square. Now there is a unique solution to the Pell equation in
which the integers $x$, $y$ have their smallest positive values; it is
called the fundamental solution. Let $x'$, $y'$ be this solution and
put $\varepsilon = x' + y'\sqrt{d}$. Then, by the arguments of § 3 of Chapter 7,
we see that all solutions are given by $x + y\sqrt{d} = \pm\varepsilon^n$, where
$n = 0, \pm 1, \pm 2, \ldots$. In particular, the equation has infinitely many
solutions.

More insight into the character of the solutions is provided
by the continued fraction algorithm. First we observe that any
solution in positive integers $x$, $y$ satisfies $x - y\sqrt{d} = 1/(x + y\sqrt{d})$,
whence $x > y\sqrt{d}$ and $x - y\sqrt{d} < 1/(2y\sqrt{d})$. This gives
$|\sqrt{d} - x/y| < 1/(2y^2)$, and it follows from § 3 of Chapter 6 that $x/y$ is a
convergent to $\sqrt{d}$. Now it was noted in § 4 of Chapter 6 that the
continued fraction for $\sqrt{d}$ has the form

$$[a_0, \overline{a_1, \ldots, a_m}];$$

the number $m$ of repeated partial quotients is called the period
of $\sqrt{d}$. Let $p_n/q_n$ ($n = 1, 2, \ldots$) be the convergents to $\sqrt{d}$ and let
$\theta_n$ ($n = 1, 2, \ldots$) be the complete quotients. We have $x = p_n$, $y = q_n$
for some $n$, that is $p_n^2 - dq_n^2 = 1$. Here $n$ must be odd for, by § 3
of Chapter 6,

$$\sqrt{d} = \frac{p_n\theta_{n+1} + p_{n-1}}{q_n\theta_{n+1} + q_{n-1}},$$

whence, on recalling that $p_{n-1}q_n - p_nq_{n-1} = (-1)^n$, we obtain

$$q_n\sqrt{d} - p_n = (-1)^n/(q_n\theta_{n+1} + q_{n-1}),$$

and so, for even $n$, $q_n\sqrt{d} > p_n$. In fact $n$ must have the form
$lm - 1$, where $l = 1, 2, 3, \ldots$ when $m$ is even and $l = 2, 4, 6, \ldots$
when $m$ is odd. For the expression for $\sqrt{d}$ above gives

$$(p_n - q_n\sqrt{d})\theta_{n+1} = q_{n-1}\sqrt{d} - p_{n-1},$$

and thus

$$(p_n^2 - dq_n^2)\theta_{n+1} = (q_{n-1}\sqrt{d} - p_{n-1})(q_n\sqrt{d} + p_n) = (-1)^{n-1}\sqrt{d} + c,$$

where $c$ is an integer. But $p_n^2 - dq_n^2 = 1$ and $n$ is odd; hence
$\theta_{n+1} = \sqrt{d} + c$. Now $\sqrt{d} = a_0 + 1/\theta_1$, where $\theta_1$ is purely periodic,
and we have $\theta_{n+1} = a_{n+1} + 1/\theta_{n+2}$. Since $\theta_1 > 1$, $\theta_{n+2} > 1$, we obtain
$a_{n+1} = a_0 + c$ and $\theta_1 = \theta_{n+2}$; it follows that $n + 1$ is divisible by $m$
and so $n$ has the form $lm - 1$, as asserted.

We have therefore shown that the only possible positive solutions
$x$, $y$ to the Pell equation are given by $x = p_n$, $y = q_n$, where
$p_n/q_n$ is a convergent to $\sqrt{d}$ with $n$ of the form $lm - 1$ as above.
In fact all of these $p_n$, $q_n$ satisfy the equation and thus they
comprise the full set of positive solutions. For, in view of the
periodicity of $\sqrt{d}$ we have $\theta_1 = \theta_{n+2}$ for all $n = lm - 1$ as above,
and hence

$$\sqrt{d} = \frac{p_{n+1}\theta_1 + p_n}{q_{n+1}\theta_1 + q_n}.$$

But $\sqrt{d} = a_0 + 1/\theta_1$, and on substituting for $\theta_1$ and using the fact
that $\sqrt{d}$ is irrational, we obtain

$$p_n = q_{n+1} - a_0q_n, \qquad p_{n+1} - a_0p_n = q_nd.$$

On eliminating $a_0$ we see that

$$p_n^2 - dq_n^2 = p_nq_{n+1} - p_{n+1}q_n,$$

and, since $n$ is odd, it follows that $p_n^2 - dq_n^2 = 1$, as required.
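
The description of the solutions in terms of convergents leads to a simple procedure for computing the fundamental solution. The Python sketch below is an illustration under stated assumptions: the function names `sqrt_cf_period` and `pell_fundamental` are chosen here, and the periodic block is generated by the usual integer recurrence for the complete quotients of $\sqrt{d}$, stopping at the partial quotient $2a_0$ that closes the period. The convergent index is taken as $n = m - 1$ when the period $m$ is even and $n = 2m - 1$ when $m$ is odd, in accordance with the result just proved.

```python
from math import isqrt

def sqrt_cf_period(d):
    """Return (a0, [a1, ..., am]) for the continued fraction of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    block = []
    m, den, a = 0, 1, a0
    # Complete quotients have the form (sqrt(d) + m)/den with integers m, den.
    while a != 2 * a0:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        block.append(a)
    return a0, block

def pell_fundamental(d):
    """Smallest positive solution (x, y) of x^2 - d*y^2 = 1."""
    a0, block = sqrt_cf_period(d)
    m = len(block)
    n = m - 1 if m % 2 == 0 else 2 * m - 1   # index lm - 1 with the least admissible l
    p_prev, q_prev = 1, 0
    p, q = a0, 1                             # the convergent p_0/q_0 = a0
    for k in range(1, n + 1):
        a = block[(k - 1) % m]
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return p, q

for d in (2, 3, 5, 7, 13, 61):
    x, y = pell_fundamental(d)
    print(d, x, y, x * x - d * y * y)
```

The last column printed is $x^2 - dy^2$, which serves as a direct check on each computed pair.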

A similar analysis applies to the equation $x^2 - dy^2 = -1$. In this
case there is no solution when the period $m$ of $\sqrt{d}$ is even. When
$m$ is odd, all positive solutions are given by $x = p_n$, $y = q_n$, where
$p_n/q_n$ is a convergent to $\sqrt{d}$ and $n = lm - 1$ with $l = 1, 3, 5, \ldots$.
Further, when the equation is soluble, there is a solution in
positive integers $x'$, $y'$ of smallest value, known as the fundamental
solution, and on writing $\eta = x' + y'\sqrt{d}$, one deduces that
all solutions are given by $x + y\sqrt{d} = \pm\eta^n$, where
$n = \pm 1, \pm 3, \pm 5, \ldots$; the result is in fact easily obtained on noting
that the fundamental solution to $x^2 - dy^2 = 1$ is given by $\varepsilon = \eta^2$.
An analogous result holds for the more general equation
$x^2 - dy^2 = k$, where $k$ is a non-zero integer. Here, when the equation
is soluble, one can specify a finite set of solutions $x'$, $y'$ such
that, on writing $\zeta = x' + y'\sqrt{d}$, all solutions are given by
$x + y\sqrt{d} = \pm\zeta\varepsilon^n$ with $n = 0, \pm 1, \pm 2, \ldots$.

As an example, consider the equation $x^2 - 97y^2 = -1$. The
continued fraction for $\sqrt{97}$ is

$$[9, \overline{1, 5, 1, 1, 1, 1, 1, 1, 5, 1, 18}].$$

Thus the period $m$ of $\sqrt{97}$ is 11 and, since $m$ is odd, the equation
is soluble. Indeed the fundamental solution is given by $x = p_{10}$,
$y = q_{10}$, where $p_n/q_n$ ($n = 1, 2, \ldots$) denote the convergents to $\sqrt{97}$.
Now the first ten convergents to $\sqrt{97}$ are 10, 59/6, 69/7, 128/13,
197/20, 325/33, 522/53, 847/86, 4757/483 and 5604/569. Hence the
fundamental solution to $x^2 - 97y^2 = -1$ is $x = 5604$, $y = 569$.
Further, if we write $\eta = 5604 + 569\sqrt{97}$ then $\varepsilon = \eta^2$ gives the
fundamental solution to $x^2 - 97y^2 = 1$; the solution is in fact
$x = 62\,809\,633$, $y = 6\,377\,352$.
Incidentally, the continued fraction for $\sqrt{d}$ always has the form

$$[a_0, \overline{a_1, a_2, a_3, \ldots, a_3, a_2, a_1, 2a_0}],$$

as for $\sqrt{97}$ above, and moreover the period $m$ of $\sqrt{d}$ is always
odd when $d$ is a prime $p \equiv 1$ (mod 4). In fact, for such $p$, the
equation $x^2 - py^2 = -1$ is always soluble. For if $x'$, $y'$ is the
fundamental solution to $x^2 - py^2 = 1$ then $x'$ is odd and so
$(x' + 1, x' - 1) = 2$; this gives either $x' + 1 = 2u^2$, $x' - 1 = 2pv^2$ or
$x' - 1 = 2u^2$, $x' + 1 = 2pv^2$ for some positive integers $u$, $v$ with $y' = 2uv$,
whence $u^2 - pv^2 = \pm 1$, and here the minus sign must hold since
$v < y'$.
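
The numerical example for $d = 97$ is easily confirmed. The following Python sketch is an illustration only; the name `convergents` and the variable names are assumptions of this example. It builds the convergents from the partial quotients 9, 1, 5, 1, 1, 1, 1, 1, 1, 5, 1 by the standard recurrence and then squares $\eta = p_{10} + q_{10}\sqrt{97}$.

```python
# Partial quotients a0, a1, ..., a10 of sqrt(97), as quoted in the example.
quotients = [9, 1, 5, 1, 1, 1, 1, 1, 1, 5, 1]

def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] via p_k = a_k p_{k-1} + p_{k-2}."""
    p_prev, q_prev = 1, 0
    p, q = partial_quotients[0], 1
    result = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

conv = convergents(quotients)
x, y = conv[10]                          # p_10, q_10

# eta^2 = (x + y*sqrt(97))^2 = (x^2 + 97*y^2) + (2*x*y)*sqrt(97)
X, Y = x * x + 97 * y * y, 2 * x * y

print(conv[1:])                          # the first ten convergents
print(x, y, x * x - 97 * y * y)
print(X, Y, X * X - 97 * Y * Y)
```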

2 The Thue equation 

A multitude of special techniques have been devised 
through the centuries for solving particular Diophantine 
equations. The scholarly treatise by Dickson on the history of
the theory of numbers (see § 6) contains numerous references to
early works in the field. Most of these were of an ad-hoc nature,
the arguments involved being specifically related to the example
under consideration, and there was little evidence of a coherent
theory. In 1900, as the tenth of his famous list of 23 problems,
Hilbert asked for a universal algorithm for deciding whether or
not an equation of the form $f(x_1, \ldots, x_n) = 0$, where $f$ denotes a
polynomial with integer coefficients, is soluble in integers
$x_1, \ldots, x_n$. The problem was resolved in the negative by
Matiyasevich, developing ideas of Davis, Robinson and Putnam on
recursively enumerable sets. The proof has subsequently been 
refined to show that an algorithm of the kind sought by Hilbert 
does not exist even if one limits attention to polynomials in just 
nine variables, and it seems to me quite likely that it does not 
in fact exist for polynomials in only three variables. For poly- 
nomials in two variables, however, the situation would appear 
to be quite different. 

In 1909, a new technique based on Diophantine approximation
was introduced by the Norwegian mathematician Axel
Thue. He considered the equation $F(x, y) = m$, where $F$ denotes
an irreducible binary form with integer coefficients and degree
at least 3, and $m$ is any integer. The equation can be expressed
as

$$a_0x^n + a_1x^{n-1}y + \cdots + a_ny^n = m,$$

and this can be written in the form

$$a_0(x - \alpha_1y) \cdots (x - \alpha_ny) = m,$$

where $\alpha_1, \ldots, \alpha_n$ signify a complete set of conjugate algebraic
numbers. Thus if the equation is soluble in positive integers $x$,
$y$ then the nearest of the numbers $\alpha_1, \ldots, \alpha_n$ to $x/y$, say $\alpha_i$, satisfies
$|x - \alpha_iy| \ll 1$. Here we are using Vinogradov’s notation; by $a \ll b$
we mean $a < bc$ for some constant $c$, that is, in this case, a number
independent of $x$ and $y$, and similarly by $a \gg b$ we shall mean
$b < ac$ for some such $c$. Now, for $y$ sufficiently large and for
$j \ne i$, we have

$$|x - \alpha_jy| = |(x - \alpha_iy) + (\alpha_i - \alpha_j)y| \gg y;$$

this gives $|x - \alpha_iy| \ll 1/y^{n-1}$ whence $|\alpha_i - x/y| \ll 1/y^n$. But by
Thue’s improvement on Liouville’s theorem mentioned in § 5
of Chapter 6, we have $|\alpha_i - x/y| \gg 1/y^{\kappa}$ for any $\kappa > \frac{1}{2}n + 1$. It
follows that $y$ is bounded above and so there are only finitely
many possibilities for $x$ and $y$. The argument obviously extends
to integers $x$, $y$ of arbitrary sign, and hence we obtain the
remarkable result that the Thue equation has only finitely many
solutions in integers. Plainly the condition $n \ge 3$ is necessary
here, for, as we have shown, the Pell equation has infinitely
many solutions.

The demonstration of Thue just described has a major limita- 
tion. Although it yields an estimate for the number of solutions 
of $F(x, y) = m$, it does not enable one to furnish the complete
list of solutions in a given instance or indeed to determine 
whether or not the equation is soluble. This is a consequence 
of the ineffective nature of the original Thue inequality on which 
the proof depends. Some effective cases of the inequality have 
been derived and, in these instances, one can easily solve the 
related Thue equation even for quite large values of m; for 
example, from the result on $\sqrt[3]{2}$ mentioned in § 5 of Chapter 6
we obtain the bound $(10^6|m|)^{23}$ for all solutions of $x^3 - 2y^3 = m$.
But still the basic limitation of Thue’s argument remains. 


Another approach was initiated by Delaunay and Nagell in the 
1920s. It involved factorization in algebraic number fields and 
it enabled certain equations of Thue type with small degree to 
be solved completely. In particular, the method applied to the 
equation $x^3 - dy^3 = 1$, where $d$ is a cube-free integer, and it
yielded the result that there is at most one solution in non-zero 
integers x, y. The method was developed by Skolem using analy- 
sis in the p-adic domain, and he furnished thereby a new proof 
of Thue’s theorem in the case when not all of the zeros of F(x, 1) 
are real. The work depended on the compactness property of 
the p-adic integers and so was generally ineffective, but Ljung- 
gren succeeded in applying the technique to deal with several 
striking examples. For instance he showed that the only integer
solutions $(x, y)$ of the equation $x^3 - 3xy^2 - y^3 = 1$ are $(1, 0)$, $(0, -1)$,
$(-1, 1)$, $(1, -3)$, $(-3, 2)$ and $(2, 1)$.
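
A direct search within a box gives a simple consistency check on the list just quoted. The Python sketch below is an illustration only; the function name `ljunggren_solutions` and the bound 50 are assumptions of this example. A finite search cannot, of course, replace the theorem, which asserts that there are no solutions outside the box either.

```python
def ljunggren_solutions(bound):
    """All integer pairs with |x|, |y| <= bound and x^3 - 3xy^2 - y^3 = 1."""
    return sorted((x, y)
                  for x in range(-bound, bound + 1)
                  for y in range(-bound, bound + 1)
                  if x ** 3 - 3 * x * y * y - y ** 3 == 1)

found = ljunggren_solutions(50)
print(found)
```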

An entirely different demonstration of Thue’s theorem was 
given by Baker in 1968. It involved the theory of linear forms 
in logarithms (see § 6 of Chapter 6) and it led to explicit bounds 
for the sizes of all the integer solutions $x$, $y$ of $F(x, y) = m$; in
fact the method yielded bounds of the form $c|m|^{c'}$, where $c$, $c'$
are numbers depending only on $F$. Thus, in principle, the complete
list of solutions can be determined in any particular instance
by a finite amount of computation. In practice the bounds
that arise in Baker’s method are large, typically of order $10^{10^{500}}$,
but it has been shown that they can usually be reduced to 
manageable figures by simple observations from Diophantine 
approximation. 

3 The Mordell equation 

Some profound results relating to the equation
$y^2 = x^3 + k$, where $k$ is a non-zero integer, were discovered by Mordell
in 1922, and the equation continued to be one of Mordell’s major 
interests throughout his life. The theorems that he initiated 
divide naturally according as one is dealing with integer sol- 
utions x, y or rational solutions. Let us begin with a few words 
about the latter. 

The equation $y^2 = x^3 + k$ represents an elliptic curve in the real
projective plane. By a rational point on the curve we shall mean
either a pair $(x, y)$ of rational numbers satisfying the equation,
or the point at infinity on the curve; in other words, the rational
points are given in homogeneous co-ordinates by $(x, y, z)$, where
$\lambda x$, $\lambda y$, $\lambda z$ are rational for some $\lambda$. It had been noted, at least by
the time of Bachet, that the chord joining any two rational points
on the curve intersects the curve again at a rational point, and
similarly that the tangent at a rational point intersects again at
a rational point. Thus, Fermat remarked, if there is a rational
point on the curve other than the point at infinity, then, by
taking chords and tangents, one would expect, in general, to
obtain an infinity of rational points; a precise result of this kind
was established by Fueter in 1930. It was also well known that
the set of all rational points on the curve form a group under
the chord and tangent process (see Fig. 8.1); the result is in fact
an immediate consequence of the addition formulae for the
Weierstrass functions $x = \wp(u)$, $y = \frac{1}{2}\wp'(u)$ that parameterize the
curve. Indeed, with this notation, the group law becomes simply
the addition of parameters. Mordell proved that the group has
a finite basis, that is, there is a finite set of parameters $u_1, \ldots, u_r$
such that all rational points on the curve are given by
$u = m_1u_1 + \cdots + m_ru_r$, where $m_1, \ldots, m_r$ run through all rational
integers. This is equivalent to the assertion that there is a finite
set of rational points on the curve such that, on starting from
the set and taking all possible chords and tangents, one obtains
the totality of rational points on the curve. The demonstration
involved an ingenious technique, usually attributed to Fermat,
known as the method of infinite descent; we shall refer to the
method again in § 4. The work applied more generally to the
equation $y^2 = x^3 + ax + b$ with $a$, $b$ rational, and so, by birational
transformation, to any curve of genus 1. Weil extended the theory
to curves of higher genus and the subject subsequently gained
great notoriety and stimulated much further research. The latter
has been directed especially to the problem of determining the
basis elements $u_1, \ldots, u_r$ or at least the precise value of $r$; this
is usually referred to as the rank of the Mordell-Weil group.
There is no general algorithm for determining these quantities
but they can normally be found in practice. Thus, for instance,
Billing proved in 1937 that all the rational points on the curve
$y^2 = x^3 - 2$ are given by $mu_1$, where $u_1$ is the parameter corresponding
to the point $(3, 5)$ and $m$ runs through all the integers.
Since no multiple of $u_1$ is a period of the associated Weierstrass
function, it follows that the equation $y^2 = x^3 - 2$ has infinitely
many rational solutions. On the other hand, it is known, for
instance, that the equation $y^2 = x^3 + 1$ has only the rational
solutions given by $(0, \pm 1)$, $(-1, 0)$ and $(2, \pm 3)$, and that the equation
$y^2 = x^3 - 5$ has no rational solutions whatever.



Fig. 8.1. Illustration of the group law on $y^2 = x^3 + 17$.
The points $P$, $Q$ and $P + Q$ are $(2, 5)$, $(\frac{1}{4}, \frac{33}{8})$ and $(-2, -3)$
respectively. The tangent at $P$ meets the curve again at
$-2P = (-\frac{64}{25}, -\frac{59}{125})$.
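
The chord and tangent process on the curve $y^2 = x^3 + 17$ can be carried out in exact rational arithmetic. The Python sketch below is an illustration only; the names `on_curve`, `add` and the constant `K` are assumptions of this example, and the formulae used are the standard ones for a curve of the form $y^2 = x^3 + k$, with the point at infinity represented by `None`. It takes $P = (2, 5)$ and $Q = (\frac{1}{4}, \frac{33}{8})$.

```python
from fractions import Fraction as F

K = 17  # the curve y^2 = x^3 + K

def on_curve(point):
    """Check that a finite point satisfies y^2 = x^3 + K."""
    x, y = point
    return y * y == x ** 3 + K

def add(P, Q):
    """Sum of two finite points under the chord and tangent law.

    Returns None for the point at infinity.
    """
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                       # vertical line: P + Q is at infinity
    if P == Q:
        slope = 3 * x1 * x1 / (2 * y1)    # tangent at P
    else:
        slope = (y2 - y1) / (x2 - x1)     # chord through P and Q
    x3 = slope * slope - x1 - x2          # third intersection of the line
    y3 = slope * (x1 - x3) - y1           # reflected in the x-axis
    return (x3, y3)

P = (F(2), F(5))
Q = (F(1, 4), F(33, 8))

print(on_curve(P), on_curve(Q))
print(add(P, Q))      # P + Q
print(add(P, P))      # 2P; the tangent at P meets the curve again at -2P
```

The reflection in the final step reflects the convention that three collinear points of the curve sum to the point at infinity, so that the third intersection of the tangent at $P$ is $-2P$ rather than $2P$.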
We turn now to integer solutions of $y^2 = x^3 + k$. Although,
initially, Mordell believed that, for certain values of $k$, the
equation would have infinitely many solutions in integers $x$, $y$,
he later showed that, in fact, for all $k$, there are only finitely
many such solutions. The proof involved the theory of reduction
of binary cubic forms and depended ultimately on Thue’s
theorem on the equation $F(x, y) = m$. Thus the argument did not
enable the full list of solutions to be determined in any particular
instance. However, the situation was changed by Baker’s work
referred to in § 2. This gave an effective demonstration of Thue’s
theorem, and, as a consequence, it furnished, for all solutions
of $y^2 = x^3 + k$, a bound for $|x|$ and $|y|$ of the form $\exp(c|k|^{c'})$,
where $c$, $c'$ are absolute constants; Stark later showed that $c'$
could be taken as $1 + \varepsilon$ for any $\varepsilon > 0$, provided that $c$ was allowed
to depend on $\varepsilon$. Thus, in principle, the complete list of solutions
can now be determined for any particular value of $k$ by a finite
amount of computation. In practice, the bounds that arise are
too large to enable one to check the finitely many remaining
possibilities for $x$ and $y$ directly; but, as for the Thue equation,
this can usually be accomplished by some supplementary analysis.
In this way it has been shown, for instance, that all integer
solutions of the equation $y^2 = x^3 - 28$ are given by $(4, \pm 6)$, $(8, \pm 22)$
and $(37, \pm 225)$.

Nevertheless, in many cases, much readier methods of solution 
are available. In particular, it frequently suffices to appeal to 
simple congruences. Consider, for example, the equation y 2 = 
x 3 + 1 1. Since y 2 = 0 or 1 (mod 4), we see that, if there is a solution, 
then x must be odd and in fact (mod 4). Now we have 

x 3 +ll = (x + 3)(x 2 -3x + 9)-16, 

and x 2 — 3x + 9 is positive and congruent to 3 (mod 4). Hence 
there is a prime p ■ 3 (mod 4) that divides t/ 2 + 16. But this gives 
y 2 = -16 (mod p), whence (t/z) 2 * -1 (mod p), where z = *( p+ 1). 
Thus - 1 is a quadratic residue (mod p), contrary to § 2 of Chapter 
4. We conclude therefore that the equation y 2 = x 3 + 11 has no 
solution in integers x, y. Several more examples of this kind are 
given in Mordell’s book (see § 6). 
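
The individual steps of the congruence argument for y^2 = x^3 + 11 can be checked mechanically for small x. The sketch below is only an illustration under stated assumptions: the helper names (`prime_factors`, `is_square`), the variables `admissible` and `found`, and the ranges searched are all choices made here, not part of the text.

```python
from math import isqrt

def prime_factors(n):
    """Distinct prime factors of n >= 2, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n

# Residues x (mod 4) for which x^3 + 11 can be a square (mod 4).
squares_mod_4 = {(y * y) % 4 for y in range(4)}
admissible = [x for x in range(4) if (x**3 + 11) % 4 in squares_mod_4]
print(admissible)  # [1]

# For x ≡ 1 (mod 4) the quadratic factor is positive and ≡ 3 (mod 4),
# so it has some prime factor p ≡ 3 (mod 4).
for x in range(-199, 200, 4):          # -199 ≡ 1 (mod 4)
    q = x * x - 3 * x + 9
    assert q > 0 and q % 4 == 3
    assert any(p % 4 == 3 for p in prime_factors(q))
    assert x**3 + 11 == (x + 3) * q - 16

# A direct search, consistent with the conclusion but not a proof of it.
found = any(is_square(x**3 + 11) for x in range(-1000, 1001))
print(found)  # False
```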

Another typical method of solution is by factorization in 
quadratic fields. Consider the equation y^2 = x^3 - 11. Since y^2 ≡ 0,
1 or 4 (mod 8) we see that, if there is a solution, then x must be
odd. We shall use the result established in § 5 of Chapter 7, that
the field Q(√(-11)) is Euclidean and so has unique factorization.
We have

(y + √(-11))(y - √(-11)) = x^3,

and the factors on the left are relatively prime; for any common
divisor would divide 2√(-11), contrary to the fact that neither
2 nor 11 divides x. Thus, on recalling that the units in Q(√(-11))
are ±1, we obtain y + √(-11) = ±ω^3 and x = N(ω) for some algebraic
integer ω in the field. Actually we can omit the minus sign
since -1 can be incorporated in the cube. Now, since -11 ≡ 1
(mod 4), we have ω = a + ½b(1 + √(-11)) for some rational integers
a, b. Hence, on equating coefficients of √(-11), we see that

1 = 3(a + ½b)^2 (½b) - 11(½b)^3,

that is (3a^2 + 3ab - 2b^2)b = 2. This gives b = ±1 or ±2, and so the
solutions (a, b) are (0, -1), (1, -1), (1, 2) and (-3, 2). But we have
x = a^2 + ab + 3b^2. Thus we conclude that the integer solutions of
the equation y^2 = x^3 - 11 are (3, ±4) and (15, ±58). A similar
analysis can be carried out for the equation y^2 = x^3 + k whenever
Q(√k) has unique factorization and k ≡ 2, 3, 5, 6 or 7 (mod 8).
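
The final enumeration for y^2 = x^3 - 11 is easily reproduced by machine. The sketch below solves (3a^2 + 3ab - 2b^2)b = 2 over the divisors b of 2 and recovers x = N(ω) and y from ω = a + ½b(1 + √(-11)). The variable names, and the device of writing ω = (s + b√(-11))/2 with s = 2a + b so as to stay within integer arithmetic, are choices made here for illustration.

```python
# Solve (3a^2 + 3ab - 2b^2) * b = 2 in integers; b must divide 2.
solutions = []
for b in (-2, -1, 1, 2):
    # For |b| <= 2 and |a| > 10 the bracket far exceeds 2 in absolute
    # value, so a small window for a is enough.
    for a in range(-10, 11):
        if (3 * a * a + 3 * a * b - 2 * b * b) * b == 2:
            solutions.append((a, b))

points = set()
for a, b in solutions:
    s = 2 * a + b                      # ω = (s + b√(-11)) / 2
    x = a * a + a * b + 3 * b * b      # x = N(ω)
    y8 = s**3 - 33 * s * b * b         # 8 × (rational part of ω^3)
    assert y8 % 8 == 0
    y = y8 // 8
    assert y * y == x**3 - 11
    points.update({(x, y), (x, -y)})

print(sorted(solutions))  # [(-3, 2), (0, -1), (1, -1), (1, 2)]
print(sorted(points))     # [(3, -4), (3, 4), (15, -58), (15, 58)]
```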

Soon after establishing his theorem on the finiteness of the 
number of solutions of y^2 = x^3 + k, Mordell extended the result
to the equation 

y 2 = ax 3 + bx 2 + cx + d, 

where the cubic on the right has distinct zeros; the work again 
rested ultimately on Thue’s theorem but utilized the reduction 
of quartic forms rather than cubic. In a letter to Mordell, an 
extract from which was published in 1926 under the pseudonym 
X, Siegel described an alternative argument that applied more 
generally to the hyperelliptic equation y^2 = f(x), where f denotes
a polynomial with integer coefficients and with at least three
simple zeros; indeed it applied to the superelliptic equation
y^m = f(x), where m is any integer ≥ 2. The theory was still further
extended by Siegel in 1929; in a major work combining his
refinement of Thue’s inequality, referred to in § 5 of Chapter 6,
together with the Mordell-Weil theorem, Siegel succeeded in
giving a simple condition for the equation f(x, y) = 0, where f
is any polynomial with integer coefficients, to possess only
finitely many solutions in integers x, y. In particular he showed 
that it suffices if the curve represented by the equation has genus 
at least 1. The result was employed by Schinzel, in conjunction 
with an old method of Runge concerning algebraic functions, 
to furnish a striking extension of Thue’s theorem; this asserts 
that the equation F(x, y) = G(x, y) has only finitely many solutions
in integers x, y, where F is a binary form as in § 2, and
G is any polynomial with degree less than that of F. 

4 The Fermat equation 

In the margin of his well-worn copy of Bachet's edition 
of the works of Diophantus, Fermat wrote ‘It is impossible to 
write a cube as the sum of two cubes, a fourth power as the sum 
of two fourth powers, and, in general, any power beyond the 
second as the sum of two similar powers. For this I have dis- 
covered a truly wonderful proof but the margin is too small to 
contain it’. As is well known, despite the efforts of numerous 
mathematicians over several centuries, no one has yet succeeded 
in establishing Fermat's conjecture and there is now considerable 
doubt as to whether Fermat really had a proof. 

Many special cases of Fermat’s conjecture have been estab- 
lished, mainly as a consequence of the work of Kummer in the 
last century. Indeed, as mentioned in Chapter 7, it was Kummer’s 
remarkable researches that led to the foundation of the theory 
of algebraic numbers. Kummer showed in fact that the Fermat 
problem is closely related to questions concerning cyclotomic 
fields. The latter arise by writing the Fermat equation x^n + y^n = z^n
in the form

(x + y)(x + ζy) ··· (x + ζ^{n-1}y) = z^n,

where ζ is a root of unity. As we shall see in a moment, the case
n = 4 can be readily treated; thus it suffices to prove that the
equation has no solution in positive integers x, y, z when n is
an odd prime p. The factors on the left are algebraic integers in
the cyclotomic field Q(ζ) and, when p ≤ 19, the field has unique
factorization; it is then relatively easy to establish the result. 
Kummer derived various more general criteria. In particular, he
introduced the concept of a regular prime p and proved that
Fermat’s conjecture holds for all such p; a prime is said to be
regular if it does not divide any of the numerators of the first
½(p - 3) Bernoulli numbers, that is, the coefficients B_j in the
equation

t/(e^t - 1) = 1 - ½t + Σ_{j=1}^∞ (-1)^{j-1} B_j t^{2j}/(2j)!.

Kummer also established the result for certain classes of irregular 
primes. Thus, in particular, he covered all p<100; there are 
only three irregular primes in the range and these he was able 
to deal with separately. The best result to date arising from this 
approach was obtained by Wagstaff in 1978; by extensive compu- 
tations he succeeded in establishing Fermat’s conjecture for all 
p< 125 000. 
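
The definition of a regular prime lends itself to direct computation. The following sketch determines the irregular primes below 100. It is an illustration only: the function names are choices made here, and the Bernoulli numbers are generated in the modern indexing, in which B_{2j} agrees up to sign with the B_j of the text, so that the numerators tested for divisibility are the same.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """[B_0, ..., B_{m_max}] in the modern indexing (B_1 = -1/2, B_2 = 1/6)."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        # Standard recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0.
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

def irregular_primes(limit):
    """Odd primes p < limit dividing a numerator among B_2, B_4, ..., B_{p-3}."""
    B = bernoulli_numbers(limit)
    return [p for p in range(3, limit) if is_prime(p)
            and any(B[2 * j].numerator % p == 0
                    for j in range(1, (p - 3) // 2 + 1))]

print(irregular_primes(100))  # [37, 59, 67]
```

The output agrees with the statement that there are only three irregular primes below 100.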

Before the work of Kummer, Fermat’s equation had already
been solved for several small values of n. The special case
x^2 + y^2 = z^2 dates back to the Greeks and the solutions (x, y, z)
in positive integers are called Pythagorean triples. It suffices to
determine all such triples with x, y, z relatively prime and with
y even; for if x and y were both odd, we would have z^2 ≡
2 (mod 4) which is impossible. On writing the equation in the
form (z + x)(z - x) = y^2, and noting that (z + x, z - x) = 2, we
obtain z + x = 2a^2, z - x = 2b^2 and y = 2ab for some positive
integers a, b with (a, b) = 1. This gives

x = a^2 - b^2, y = 2ab, z = a^2 + b^2.

Moreover, since z is odd, we see that a and b have opposite
parity. Conversely, it is readily verified that if a, b are positive
integers with a > b, (a, b) = 1 and of opposite parity then x, y, z
above furnish a Pythagorean triple with (x, y, z) = 1 and with y even.
Thus we have found the most general solution of x^2 + y^2 = z^2.
The first four Pythagorean triples, that is, with smallest values
of z, are (3, 4, 5), (5, 12, 13), (15, 8, 17) and (7, 24, 25).
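
The parametrization translates directly into a short program. The sketch below lists the primitive triples with y even in order of increasing z; the function name `primitive_triples` and the bound on z are illustrative assumptions.

```python
from math import gcd, isqrt

def primitive_triples(z_max):
    """Primitive triples (x, y, z) with y even and z <= z_max, ordered by z."""
    triples = []
    for a in range(2, isqrt(z_max) + 1):
        for b in range(1, a):
            # a > b > 0, coprime and of opposite parity
            if gcd(a, b) == 1 and (a - b) % 2 == 1:
                x, y, z = a * a - b * b, 2 * a * b, a * a + b * b
                if z <= z_max:
                    triples.append((x, y, z))
    return sorted(triples, key=lambda t: t[2])

print(primitive_triples(30))
# [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29)]
```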

The next simplest case of Fermat’s equation is x^4 + y^4 = z^4.
This was solved by Fermat himself, using the method of infinite
descent. He considered in fact the equation x^4 + y^4 = z^2. If there
is a solution in positive integers, then it can be assumed that x,
y, z are relatively prime and that y is even. Now (x^2, y^2, z) is a
Pythagorean triple and there exist integers a, b as above such
that x^2 = a^2 - b^2, y^2 = 2ab and z = a^2 + b^2. Further, b must be
even, for otherwise we would have a even and b odd, and so
x^2 ≡ -1 (mod 4), which is impossible. Furthermore, (x, b, a) is a
Pythagorean triple. Hence we obtain x = c^2 - d^2, b = 2cd and
a = c^2 + d^2 for some positive integers c, d with (c, d) = 1. This
gives y^2 = 2ab = 4cd(c^2 + d^2). But c, d and c^2 + d^2 are coprime
in pairs, whence c = e^2, d = f^2 and c^2 + d^2 = g^2 for some positive
integers e, f, g. Thus we have e^4 + f^4 = g^2 and g ≤ g^2 = a ≤ a^2 < z.
It follows, on supposing that z is chosen minimally at the outset,
that the equation x^4 + y^4 = z^2 has no solution in positive integers
x, y, z.

The first apparent proof that the equation x^3 + y^3 = z^3 has no
non-trivial solution was published by Euler in 1770, but the
argument depended on properties of integers of the form a^2 + 3b^2,
and there has long been some doubt as to its complete validity.
An uncontroversial demonstration was given later by Gauss using
properties of the quadratic field Q(√(-3)). The proof is another
illustration of the method of infinite descent. By considering
congruences (mod λ^4), where λ is the prime ½(3 - √(-3)) in
Q(√(-3)), it is readily verified that if the equation x^3 + y^3 = z^3
has a solution in positive integers, then one at least of x, y, z is
divisible by λ. Hence, for some integer n ≥ 2, the equation
α^3 + β^3 + ηλ^{3n}γ^3 = 0 has a solution with η a unit in Q(√(-3)) and
with α, β, γ non-zero algebraic integers in the field. It is now
easily deduced, by factorizing α^3 + β^3, that the same equation,
with n replaced by n - 1, has a solution as above, and the desired
result follows. The equation x^5 + y^5 = z^5 was solved by Legendre
and Dirichlet about 1825, and the equation x^7 + y^7 = z^7 by Lamé
in 1839. By then, however, the ad-hoc arguments were becoming
quite complicated and it was not until the fundamental work
of Kummer that Fermat’s conjecture was established for
equations with higher prime exponents.

Numerous results have been obtained concerning special 
classes of solutions. For instance, Sophie Germain proved in 
1823 that if p is an odd prime such that 2p + 1 is also a prime
then the ‘first case’ of Fermat’s conjecture holds for p, that is,
the equation x^p + y^p = z^p has no solution in positive integers
with xyz not divisible by p. Further, Wieferich proved in 1909
that the same conclusion is valid for any p that does not satisfy
the congruence 2^{p-1} ≡ 1 (mod p^2). These results were greeted
with great admiration at the times of their discovery. The latter
condition is not in fact very stringent; there are only two primes
up to 3 · 10^9 that satisfy the congruence, namely 1093 and 3511.
In another direction, an important advance has recently been
made by Faltings using algebraic geometry; he has confirmed a
long-standing conjecture of Mordell that furnishes, as a special
case, the striking theorem that, for any given n ≥ 4, there are
only finitely many solutions to the Fermat equation in relatively
prime integers x, y, z. Nonetheless, Fermat’s conjecture remains
open, and the Wolfskehl Prize, offered by the Academy of Sciences
in Göttingen in 1908 for the first demonstration, still waits
to be conferred. The prize originally amounted to 100 000 DM,
but, alas, the sum has been much eroded by inflation. 
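
The two arithmetical conditions mentioned above, that of Sophie Germain and that of Wieferich, are readily tested by machine. The sketch below does so over small ranges; the function names and the bounds 10 000 and 50 are illustrative assumptions, and the search goes nowhere near the range 3 · 10^9 quoted in the text.

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

def is_wieferich(p):
    """True when 2^(p-1) ≡ 1 (mod p^2)."""
    return pow(2, p - 1, p * p) == 1   # modular exponentiation

wieferich = [p for p in range(3, 10_000) if is_prime(p) and is_wieferich(p)]
print(wieferich)  # [1093, 3511]

# Odd primes p < 50 for which 2p + 1 is also prime (Sophie Germain's condition).
germain = [p for p in range(3, 50) if is_prime(p) and is_prime(2 * p + 1)]
print(germain)  # [3, 5, 11, 23, 29, 41]
```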

5 The Catalan equation 

In 1844, Catalan conjectured that the only solution of 
the equation x^p - y^q = 1 in integers x, y, p, q, all > 1, is given by
3^2 - 2^3 = 1. The conjecture has not yet been established but a
notable advance towards a demonstration was made by Tijdeman 
in 1976. He proved, by means of the theory of linear forms in 
logarithms (see § 6 of Chapter 6), that the equation has only 
finitely many integer solutions and all of these can be effectively 
bounded. Thus, in principle, Tijdeman’s work reduces the prob- 
lem simply to the checking of finitely many cases; however, at 
present, the bounds furnished by the theory are too large to make 
the computation practical. 
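
Although the bounds arising from the theory are far out of reach, the equation can at least be explored directly for small values. The following sketch searches a modest box of bases and exponents; the function name `catalan_solutions` and the limits 100 and 10 are arbitrary illustrative choices and bear no relation to Tijdeman's bounds.

```python
def catalan_solutions(base_max, exp_max):
    """All (x, p, y, q) with x^p - y^q = 1, where 2 <= x, y <= base_max
    and 2 <= p, q <= exp_max."""
    # Index every perfect power y^q in the box by its value.
    powers = {}
    for y in range(2, base_max + 1):
        for q in range(2, exp_max + 1):
            powers.setdefault(y**q, []).append((y, q))
    found = []
    for x in range(2, base_max + 1):
        for p in range(2, exp_max + 1):
            for y, q in powers.get(x**p - 1, []):
                found.append((x, p, y, q))
    return found

print(catalan_solutions(100, 10))  # [(3, 2, 2, 3)]
```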

To illustrate the approach, let us consider the simpler equation 
ax^n - by^n = c, where a, b and c ≠ 0 are given integers, and we
seek to bound all the solutions in positive integers x, y and n ≥ 3.
We can assume, without loss of generality, that a and b are
positive, and it will suffice to treat the case y ≥ x. The equation
can be written in the form e^Λ - 1 = c/(by^n), where

Λ = log(a/b) + n log(x/y).

Now we can suppose that y^n > (2c)^2, for otherwise the solutions
can obviously be bounded; then we have |e^Λ - 1| < 1/(2y^{n/2}). But,
for any real number u, the inequality |e^u - 1| < ½ implies that
|u| ≤ 2|e^u - 1|. Hence we obtain |Λ| < y^{-n/2}, that is log |Λ| <
-½n log y. On the other hand, by the theory of linear forms in
logarithms, we have log |Λ| ≫ -log n log y, where the implied
constant depends only on a and b. Thus we see that ½n ≪ log n,
whence n is bounded in terms of a and b. The required bounds 
for x and y now follow from the effective result of Baker on the 
Thue equation referred to at the end of § 2. 

The work of Tijdeman on the equation x^p - y^q = 1 runs on the
same lines. One can assume that p, q are odd primes and then,
by elementary factorization, one obtains x = kX^q + 1, y = lY^p - 1
for some integers X, Y, where k is 1 or 1/p and l is 1 or 1/q.
Plainly we have |p log x - q log y| ≪ y^{-q}, and substituting for x
and y on the left yields a linear form

Λ = p log k - q log l + pq log(X/Y)

for which |Λ| is small; similar forms arise by substituting for just
one of x and y. The theory of linear forms in logarithms now
furnishes the desired bounds for p and q, and those for x and y
then follow from an effective version of the result on the
superelliptic equation referred to in § 3.

Several instances of Catalan’s equation were solved long before 
the advent of transcendence theory. Indeed, in the Middle Ages, 
Leo Hebraeus had already dealt with the case x = 3, y = 2 and, 
in 1738, Euler had solved the case p = 2, q = 3. The case q = 2
was treated by V.A. Lebesgue in 1850, the cases p = 3 and q = 3 
by Nagell in 1921, the case p = 4 by S. Selberg in 1932 and the 
case p = 2, which includes the result for p = 4, by Chao Ko in 
1967. Moreover Cassels proved in 1960 that if p, q are primes, 
as one can assume, then p divides y and q divides x. Let us 
convey a little of the flavour of these works by proving that the 
equation x^5 - y^2 = 1 has no solution in integers except x = 1, y = 0.
We shall use the unique factorization property of the Gaussian
field. Clearly, since y^2 ≡ 0 or 1 (mod 4), we have x odd and y
even. The equation can be written in the form x^5 = (1 + iy)(1 - iy)
and, since x is odd, the factors on the right are relatively prime.
Thus we have 1 + iy = εω^5, where ε is a Gaussian unit and ω is
a Gaussian integer. Now ε = ±1 or ±i, and so ε = ε^5. Hence, on
writing εω = u + iv, where u, v are rational integers, we have
1 + iy = (u + iv)^5. This gives u^5 - 10u^3v^2 + 5uv^4 = 1, whence u = ±1
and 1 - 10v^2 + 5v^4 = ±1. It follows easily that u = 1, v = 0 and so
x = 1, y = 0, as required. The argument can readily be extended
to establish Lebesgue’s more general theorem that, for any odd
prime p, the equation x^p - y^2 = 1 has no solution in positive
integers. 
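
The last step of the argument for x^5 - y^2 = 1, namely the solution of u^5 - 10u^3v^2 + 5uv^4 = 1, can be confirmed by a finite check. The sketch below works with Gaussian integers as pairs of ordinary integers; the helper `gauss_pow` and the variable names are introduced here purely for illustration.

```python
def gauss_pow(u, v, n):
    """(u + iv)^n as a pair (real part, imaginary part), in exact integers."""
    re, im = 1, 0
    for _ in range(n):
        re, im = re * u - im * v, re * v + im * u
    return re, im

# u^5 - 10u^3v^2 + 5uv^4 = u(u^4 - 10u^2v^2 + 5v^4) = 1 forces u = ±1.
candidates = []
for u in (1, -1):
    # For |v| >= 2 we have 5v^4 - 10v^2 >= 40, so a small window suffices.
    for v in range(-5, 6):
        if u * (u**4 - 10 * u**2 * v**2 + 5 * v**4) == 1:
            candidates.append((u, v))
print(candidates)  # [(1, 0)]

results = []
for u, v in candidates:
    re, y = gauss_pow(u, v, 5)      # 1 + iy = (u + iv)^5
    assert re == 1
    x = u * u + v * v               # x is the norm of u + iv
    assert x**5 - y * y == 1
    results.append((x, y))
print(results)  # [(1, 0)]
```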

A particularly striking result on an exponential Diophantine 
equation rather like those of Fermat and Catalan was obtained 
by Erdős and Selfridge in 1975; they proved that a product of
consecutive integers cannot be a perfect power, that is, the
equation

y^n = x(x + 1) ··· (x + m - 1)

has no solution in integers x, y, m, n, all > 1.
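
The statement is easily tested over a small range. The sketch below looks for perfect powers among products of m consecutive integers beginning at x; the helper names and the limits on x and m are arbitrary illustrative choices, and an empty result merely agrees with the theorem within that range.

```python
def integer_root(n, k):
    """Largest integer r >= 0 with r**k <= n (for n >= 0, k >= 1)."""
    lo, hi = 0, 1
    while hi**k <= n:
        hi *= 2
    while hi - lo > 1:              # invariant: lo**k <= n < hi**k
        mid = (lo + hi) // 2
        if mid**k <= n:
            lo = mid
        else:
            hi = mid
    return lo

def is_perfect_power(n):
    """True when n = y**k for some integers y > 1 and k > 1."""
    k = 2
    while 2**k <= n:
        if integer_root(n, k) ** k == n:
            return True
        k += 1
    return False

hits = []
for x in range(2, 200):
    product = x
    for m in range(2, 7):
        product *= x + m - 1        # product = x(x + 1)...(x + m - 1)
        if is_perfect_power(product):
            hits.append((x, m))
print(hits)  # []
```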

6 Further reading 

As remarked in § 2, an excellent source for early results 
on Diophantine equations is Dickson’s History of the theory of 
numbers (Washington, 1920; reprinted Chelsea Publ. Co., New 
York). Another good reference work is Skolem’s Diophantische 
Gleichungen (Springer-Verlag, Berlin, 1938). 

For general reading, Mordell’s Diophantine equations 
(Academic Press, London and New York, 1969) is to be highly 
recommended; the author was one of the great contributors to 
the subject and, as one would expect, he covers a broad range 
of material with clarity and considerable skill. 

As regards special topics, Ribenboim's 13 Lectures on Fermat’s 
last theorem (Springer-Verlag, Berlin, 1979) is a very fine mono- 
graph. There is also a good discussion of the Fermat equation 
in Borevich and Shafarevich Number theory (Academic Press, 
New York and London, 1966). Several other books on number 
theory contain valuable sections on Diophantine equations; this 
applies especially to Nagell’s Introduction to number theory 
(Wiley, New York, 1951). For an account of recent results on 
the effective solution of Diophantine equations see Baker’s
Transcendental number theory (Cambridge U.P., 2nd edn, 1979).

The theorem of Schinzel referred to at the end of § 3 appeared
in Comment. Pontificia Acad. Sci. 2 (1969), no. 20, 1-9. The
theorem of Faltings referred to at the end of § 4 appeared in
Invent. Math. 73 (1983), 349-66. The theorem of Erdős and
Selfridge referred to at the end of § 5 appeared in the Illinois J.
Math. 19 (1975), 292-301.

7 Exercises 

(i) Prove that, if (x_n, y_n), with n = 1, 2, ..., is the
sequence of positive solutions of the Pell equation
x^2 - dy^2 = 1, written according to increasing values of
x or y, then x_n and y_n satisfy a recurrence relation
u_{n+2} - 2au_{n+1} + u_n = 0, where a is a positive integer.
Find a when d = 1. 

(ii) Determine whether or not the equation x^2 - 31y^2 = -1
is soluble in integers x, y.

(iii) Show that if p, q are primes ≡ 3 (mod 4) then at least
one of the equations px^2 - qy^2 = ±1 is soluble in
integers x, y.

(iv) Prove, by congruences, that if a, c are integers with
a > 1, c > 1 and a + c < 16, then the equation x^4 -
ay^4 = c has no solution in rationals x, y.

(v) Show that the equation x^3 + 2y^3 = 7(z^3 + 2w^3) has no
solution in relatively prime integers x, y, z, w.

(vi) By considering the intersection of the quartic surface
x^4 + y^4 + z^4 = 2 with the line y = z - x = 1 - tx, where t
is a parameter, show that the equation x^4 + y^4 + z^4 =
2w^4 has infinitely many solutions in relatively prime
integers x, y, z, w.

(vii) Solve the equation y^2 = x^3 - 17 in integers x, y by
considering the factors of x^3 + 8.

(viii) Solve the equation y^2 = x^3 - 2 in integers x, y by
factorization in Q(√(-2)).

(ix) Prove, by the method of infinite descent, that the
equation x^4 - y^4 = z^2 has no solution in positive
integers x, y, z.

(x) Prove that the equation x^4 - 3y^4 = z^2 has no solution in
positive integers x, y, z. Deduce that the equation x^4 +
y^4 = z^3 has no such solution with (x, y) = 1. (For the
first part see Pocklington, Proc. Cambridge Phil. Soc.
17 (1914), 108-21.)

(xi) By considering (x + 1)^3 + (x - 1)^3, show that every
integer divisible by 6 can be represented as a sum of
four integer cubes. Show further that every integer
can be represented as a sum of five integer cubes.

(xii) Prove, by factorization in Q(√(-7)), that the equation
x^2 + x + 2 = y^3 has no solution in integers x, y except
x = 2, y = 2 and x = -3, y = 2. Verify that the equation
x^2 + 7 = 2^{3k+2} has no solution in integers k, x with k >
1. (This is a special case of a conjecture of Ramanujan
to the effect that the equation x^2 + 7 = 2^n has only the
integer solutions given by n = 3, 4, 5, 7 and 15. The
conjecture was proved by Nagell; for a demonstration
see Mordell’s book, page 205.)